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1.  Purpose.  The  principal  purpose  of  this  ETL  is 
to  introduce  the  reader  to  geostatistical  technicpies 
and  to  demonstrate  their  basic  utility  with  respect 
to  HTRW  site  investigations.  The  ETL  also  will 
include  a  discussion  of  statistical  concepts  that 
support  the  science  of  geostatistics.  Practical 
aspects  of  geostatistical  techniques  will  be  dis¬ 
cussed  in  two  ways.  First,  practical  references  will 
be  made,  when  appropriate,  during  the  discussion 
of  statistical  concepts,  and  second,  examples 
describing  several  aspects  of  the  use  of  geosta¬ 
tistical  techniques  in  HTRW  site  investigations  will 
be  presented  and  discussed  in  a  section  of  this  ETL 
specifically  dedicated  to  providing  working  exam¬ 
ples.  This  ETL  also  will  include  a  brief  literature 
and  software  review;  review  of  geostatistical  appli¬ 
cations;  comparison  of  information  that  is  gener¬ 
ated  with  geostatistical  methods  to  that  information 
obtained  using  classical  statistical  methods;  and 
some  more  recent  geostatistical  methods,  such  as 
conditional  simulation. 


2.  Applicability.  This  letter  applies  to  all 
USAGE  commands  having  HTRW  investigation, 
design,  and  remedial  action  responsibility  within 
the  military  or  civil  works  programs. 

3.  References.  Documents  referenced  in  this 
ETL  are  hsted.  Appendix  A  contains  additional 
references  useful  in  geostatistical  application. 

a.  EM  200-1-2,  Technical  Project  Planning 
Guidance  for  HTRW  Data  Quality  Design. 


b.  ASTM  D-5922,  Standard  Guide  for 
Analj^is  of  Spatial  Variation  in  Geostatistical  Site 
Investigations. 

c.  ASTM  D-5549,  Standard  Guide  for 
Content  of  Geostatistical  Site  Investigations. 


4.  Distribution  Statement.  Approved  for  public 
release,  distribution  is  unlimited. 

5.  Discussion. 

a.  Geostatistics  is  a  powerful  tool  to  assess 
relationships  among  data  obtained  from  various 
locations.  It  allows  optimization  of  sample  spac¬ 
ing  and  frequency.  More  importantly,  geostatistics 
also  allows  one  to  effectively  estimate  parameter 
values  in  areas  between  actual  san5)le  points  and 
quantify  the  uncertainty  of  the  estimated  values. 
This  can  be  very  valuable  in  risk  management  and 
design  decision  making.  This  ETL  builds  upon  the 
principles  introduced  in  EM  200-1-2. 

b.  The  ETL  contains  exan^les  which  illus¬ 
trate  the  statistical  principles  discussed  throughout 
the  document.  Not  every  application  of  geosta¬ 
tistics  to  HTW  projects  could  be  illustrated,  how¬ 
ever,  and  the  user  must  be  aware  of  the  basic 
principles  and  seek  appropriate  appUcations.  Spe¬ 
cific  examples  of  typical  cost-effective  applications 
of  geostatistics  are  also  given  here. 
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(1)  Geostatistics,  by  the  construction  of  a  vari- 
ogram  based  on  preliminary  sampling,  can  be  used 
to  determine  the  typical  separation  of  sampling 
points  that  delineate  uncorreiated  data.  The 
range  of  the  variogram  is  used  as  a  basis  for  selec¬ 
ting  a  san^le  spacing  that  minimizes  costs  and 
provides  independent  data  for  determining,  for 
exanq)le,  average  exposure  values  for  risk  assess¬ 
ment.  First,  an  adequate  number  of  preliminary 
samples  are  analyzed  from  the  site  (refer  to  sec¬ 
tion  4-3).  Second,  a  variogram  is  constructed  using 
techniques  described  in  Chapter  4.  Third,  the  range 
of  the  variogram,  as  defined  in  section  2-3  is  deter¬ 
mined.  Lastly,  the  range  or  some  multiple  or  frac¬ 
tion  of  it,  is  chosen  for  future  sample  spacing.  The 
variogram  should  be  updated  as  new  data  are  col¬ 
lected.  For  example,  the  variogram  may  indicate 
data  spaced  more  than  200  ft  apart  are  uncorrelated. 
Closure  sampling  may  then  be  proposed  to  be 
spaced  every  200  ft  or  more  along  an  excavation. 
Smaller  spacing  results  in  unnecessary  duplication 
of  information  and  unneeded  expenditure  of  funds. 

(2)  Geostatistics,  through  block  kriging,  can 
yield  estimates  of  the  average  concentrations  to  be 
encountered  in  a  typical  daily  excavation  area/ 
volume.  For  applications  such  as  excavation  of  near 
surface  contamination,  two-dimensional  block 
kriging  could  be  used  to  estimate  mean  contaminant 
concentration  for  specific  excavation  areas. 

Although  this  document  does  not  address  three- 
dimensional  block  kriging  for  estimating  mean  con¬ 
centrations  within  given  volumes,  additional  guid¬ 
ance  and  tools  for  three-dimensional  kriging  are 
available  through  references  cited  in  Appendix  A. 
Alternatively,  one  can  use  two-dimensional  block 
kriging  to  estimate  mean  concentrations  in  different 
layers  within  a  given  volvune.  These  estimates  can 
then  be  averaged  to  approximate  the  overall  average 
concentration  within  the  entire  volume.  This 
assumes  adequate  data  exist  to  perform  the  two- 
dimensional  block  kriging  at  the  different  depths. 

To  perform  two-dimensional  block  kriging,  adequate 
site  characterization  data  are  collected  (refer  to 
section  4-4).  Second,  the  data  gathered  from  the 
areas  of  interest  are  used  to  construct  a  variogram, 
as  described  Chapter  4.  Third,  the  variogram  is 


modeled  as  described  in  section  4-6.  Lastly,  the 
model  is  used  to  perform  block  kriging,  as 
described  in  section  2-4  for  blocks  of  a  size  com¬ 
parable  to  the  daily  excavation  area/volume.  The 
Wock-kriged  values  can  then  be  used  for  estimating 
the  treatment  plant  loading,  etc.,  related  to  that 
block.  The  kriging  also  quantifies  the  possible 
variance  in  the  average  concentration  for  each 
block  that  can  be  used  to  manage  the  risk  of 
operating  a  treatment  plant. 

(3)  Exposure  concentrations  for  risk  assess¬ 
ment  purposes  can  be  cortqjuted,  using  geostatis¬ 
tics,  even  though  the  site  characterization  data  are 
somewhat  clustered  or  were  collected  using  biased 
sampling  strategies.  Assuming  the  data  are 
already  available  and  adequate  in  number  (refer  to 
section  4-4),  the  first  step  is  to  compute  a  san^le 
variogram,  as  described  in  Chapter  4.  Second,  the 
variogram  is  modeled  as  described  in  section  4-6. 
Next,  this  model  is  used  in  performing  a  block 
kriging  operation  over  the  inferred  exposure  area, 
as  described  in  section  2-3.  Finally,  die  block 
kriging  value  can  be  used,  along  with  the  kriging 
variance,  to  determine  the  exposure  point  con¬ 
centration,  assuming  the  data  were  normally 
distributed  (or  were  transformed  to  be  normally 
distributed). 

(4)  The  last  example  describes  the  use  of  geo¬ 
statistics  to  quantify  project  risk  for  excavation 
or  treatment  volumes.  Even  with  anq)le  site  char¬ 
acterization  point  data  (borings  or  wells),  the  limits 
of  the  treatment  zone  are  imperfectly  defined. 
Geostatistics  allows  one  to  evaluate  the  risk  that 
the  size,  and  therefore  cost,  of  the  remediation  may 
be  larger  or  smaller  than  expected.  First,  site  char¬ 
acterization  is  performed  and  adequate  data  are 
collected  (as  described  in  section  4-4).  Second,  the 
data  are  transformed  by  assigning  a  value  of  one  or 
zero,  depending  on  whether  the  value  is  above  or 
below,  respectively,  a  given  clean-up  value  or 
other  criteria.  Third,  the  transformed  data  are  then 
used  to  construct  a  variogram  as  described  in 
Chapter  4.  Fourth,  this  variogram  is  modeled  as 
described  in  section  4-6.  Next,  this  model  is  used 
in  performing  indicator  kriging  as  described  in  sec¬ 
tion  2-6.  The  kriging  estimates  essentially  reflect  a 
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probability  that  the  concentration  at  the  points  of 
estimation  exceed  the  clean-up  value  or  other  stan¬ 
dard.  These  kriging  estimates  can  be  contoured  to 
define  areas  or  volumes  of  material  that  have  a 
certain  likelihood  of  exceeding  some  cleanup  value. 
The  contour  value  is  essaitially  the  probability  of 
exceedance.  Lastly,  the  size  of  the  area  defined  by 
different  probabilities  of  exceedance  can  be  deter¬ 
mined  and,  using  a  unit  cost  or  similar  approach,  a 
cost-versus-risk  curve  can  be  developed.  This  can 
be  used  in  programming  money  for  the  project,  as  a 
basis  for  negotiating  cleanup  levels  with  regulators, 
or  to  help  determine  if  the  cost  and  time  of  addi¬ 
tional  characterization  work  will  be  offset  by  less 
risk  during  construction.  Alternatively,  rathw  than 
transforming  the  data  to  ones  and  zeros,  the  actual 
values  are  kriged  and  the  kriging  variances  can  be 
used  to  determine  prediction  intervals  on  each  esti¬ 
mated  value  as  described  in  section  2-6.  In  the 
vicinity  of  the  point  estimate,  these  prediction  inter¬ 
vals  can  be  used  to  define  the  spread  of  potential 
values  expected  within  a  given  probability.  This 
assumes  the  data  are  normally  distributed  or  have 
been  transformed  to  be  normally  distributed. 


6.  Actions  Required. 

a.  USAGE  elements  identified  in  paragraph  2 
shall  consider  applications  of  geostatistics  as 

FOR  THE  COMMANDER; 
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described  in  this  document  as  appropriate.  This  is 
particularly  true  during  planning  of  large-scale  site 
characterization  efforts  or  when  there  are  risk 
management  or  design  decisions  to  be  made  that 
must  consider  the  uncertainty  of  site  characteriza¬ 
tion  results.  The  same  USAGE  elements  should 
also  encourage  the  use  of  geostatistics,  where 
appropriate,  by  their  contractors. 

b.  USAGE  elements  shall  make  every  effort 
to  familiarize  staff  members  actively  supporting 
HTRW  projects  with  the  fundamentals  and  poten¬ 
tial  benefits  of  the  application  of  geostatistics. 

This  letter  is  a  good  starting  point  for  learning 
about  the  use  of  geostatistics  for  HTRW  projects. 
Users  are  encouraged  to  attend  appropriate 
training. 

c.  This  letter  sets  out  procedures  for  the  tech¬ 
nically  correct  application  of  geostatistics  which 
are  consistent  with  current  practice,  such  as  set 
forth  in  ASTM  D-5922  and  D-5549.  The  techni¬ 
cal  procedures  outlined  herein  shall  be  considered 
when  performing  USAGE  in-house  geostatistical 
analysis  or  reviewing  such  analyses  done  by 
USAGE  contractors. 


!;hief.  Environment^  Division 
fectorate  (Hf  Military  Pra^ams 
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Chapter  1 
introduction 


1-1.  General 

a.  This  Engineer  Technical  Letter  (ETL) 
addresses  the  use  of  geostatistics  at  hazardous, 
toxic,  and  radioactive  waste  (HTRW)  sites.  One 
very  fundamental  aspect  of  perhaps  all  HTRW  site 
investigations  that  deal  with  environmental  con¬ 
tamination  is  the  need  to  characterize  the  extent 
and  spatial  distribution  of  contamination.  Such  a 
characterization  typically  would  include  describ¬ 
ing,  using  a  variety  of  statistical  or  analytical  tools, 
spatial  trends  and  variability.  A  principal  diffi¬ 
culty  in  doing  this  is  the  fact  that  measurements 
may  be  few,  or  may  be  sparsely  scattered  over 
large  regions.  A  question  that  arises  naturally  in 
this  situation  is  how  one  might  interpolate  in  order 
to  make  predictions  (or  estimates)  at  points  where 
measurements  of  contaminant  concentration  are 
not  available.  Such  interpolation  will  be  referred 
to  as  point,  or  punctual,  estimation  in  this  ETL. 
Additionally,  an  investigator  may  need  to  deter¬ 
mine  a  single  representative  value  for  an  area  that 
is  represented  by  several  measured  or  estimated 
values  or  both;  this  will  be  referred  to  in  this  ETL 
as  block  estimation.  Geostatistics  is  a  set  of  sta¬ 
tistical  procedures  designed  to  accomplish  these 
ends.  Geostatistics  may  be  applied  to  many  prob¬ 
lems,  other  than  contamination,  that  occur  at 
HTRW  sites.  Even  though  this  document  addres¬ 
ses  only  twodimensional  applications,  geostatistics 
can  be  used  in  three  dimensions  as  well.  Indeed, 
there  are  many  cases  in  which  the  third  dimension, 
usually  stratification,  is  desirable  to  address. 

b.  Kriging  is  the  principal  geostatistical  meth¬ 
odology  described  in  this  ETL.  For  introductory 
purposes  kriging  can  be  defined  as  a  technique  for 
determining  the  optimal  wei^ting  of  measure¬ 
ments  at  sampled  locations  for  obtaining  predic¬ 
tions,  or  estimates,  at  unsampled  locations; 
additional  definition  of  kriging  is  provided  through¬ 
out  this  document.  Kriging  is  well-suited  for  mak¬ 
ing  point  and  block  estimates.  However,  much  of 
the  advantage  of  using  geostatistical  procedures. 


such  as  kriging,  lies  not  just  in  the  point  and  block 
estimates  they  provide,  but  in  the  information  they 
provide  concerning  uncertainty  associated  with 
these  estimates.  The  uncertainty  information  is 
usually  quantified  as  either  the  standard  deviation 
(or  variance)  associated  with  kriging  estimates  and 
is  referred  to  as  kriging  standard  deviation  (or 
kriging  variance)  in  this  ETL. 

c.  Original  geostatistical  work  involved 
making  estimates  for  the  areal  extent  and  concen¬ 
trations  of  economic  mineral  deposits,  in  relation  to 
mining.  Today  (1996),  geostatistical  techniques 
continue  to  have  a  function  in  mining.  However,  a 
well-developed  methodology  that  is  c£q)able  of 
interpolating  a  given  set  of  measured  values  at  dis¬ 
crete  locations  into  estimates  for  new  locations  or 
developing  an  individual  estimate  for  an  area 
including  many  locations,  or  both,  has  attracted 
users  from  many  disciplines,  and  there  is  a  trend 
toward  incorporating  geostatistics  as  standard  cur¬ 
riculum  for  most  geo-science  educational  pro¬ 
grams.  The  use  of  geostatistical  techniques  as  part 
of  HTRW  site  investigations  is  becoming  common 
because  of  the  almost  routine  need  for  data  inter¬ 
polation  as  part  of  these  investigations. 

d.  Once  investigators  have  established  that 
their  data  are  adequate  as  to  quality  and  quantity, 
geostatistics  can  provide  powerful  analytical  tools 
that  result  in  quantitative  characterization  of  areas 
of  special  interest  within  the  study  area  or  ttie 
entire  study  area.  These  characterizations  may 
address  spatial  variation;  for  example,  it  may  be 
determined  where  values  for  concentrations  of 
contaminants  in  soils  are  relatively  high  or  low,  are 
less  than  or  greater  than  a  specified  value,  or  even 
have  a  high  or  low  probability  of  exceeding  a 
certain  value. 


1-2.  Scope 

a.  The  scope  of  this  ETL  will  be  limited 
principally  to  discussions  and  examples  of  two- 
dimensional  point  and  block  estimations  using  a 
geostatistical  method  known  as  kriging.  The  ETL 
will  present  the  technical  aspects  of  geostatistics 
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through  discussion  of  the  assumptions  behind  and 
the  mechanics  of  several  types  of  kriging,  including 
ordinary  kriging,  which  is  applicable  when  the 
mean  for  die  variable  of  interest  is  constant  over 
the  region  of  interest,  and  universal  kriging,  which 
is  applicable  when  the  mean  for  the  variable  of 
interest  changes  gradually  over  the  region.  The 
discussion  also  will  address  a  specialized  form  of 
kriging  known  as  indicator  kriging  and  the  use  of 
information  concerning  uncertainty  associated  with 
kriging  estimates.  The  fundamental  concepts  of 
geostatistical  kriging  theory  will  be  provided  in  this 
ETL;  however,  references  will  be  provided  for 
additional  and  more  detailed  information. 

b.  The  practical  aspects  of  kriging  will  be  dis¬ 
cussed  through  categorical  examples  of  HTRW 
site  investigations.  The  phrase  “HTRW  site 
investigations,”  will  refer  to  planning,  analysis, 
and  remediation  implementation  phases  of  HTRW 
projects. 

c.  Additional  topics  included  in  this  ETL  such 
as  review  of  applications  and  of  some  of  the  newer 
geostatistical  techniques  will  be  limited.  The  intent 
will  be  to  familiarize  the  reader  with  these  topics 
and  not  to  provide  how-to  knowledge. 

1-3.  Organization 

a.  This  ETL  is  organized  into  seven  chapters. 
Chapter  1  is  introductory  and  includes  an  overview 
of  the  technical  a^ects  of  spatial  prediction  in 
general  and  certain  geostatistical  concepts.  Chap¬ 
ter  2  provides  a  detailed  discussion  of  assumptions 
and  theory  behind  kriging,  including  equations  and 
concepts  that  will  be  useful  to  investigators  who 
wish  to  gain  a  better  understanding  of  the  technical 
aspects,  or  mathematics,  of  kriging  interpolation. 
As  indicated,  many  of  the  concepts  developed  in 
Chapter  2  are  discussed  in  very  general  terms  in 
Chapter  1,  so  those  readers  desiring  only  an  over¬ 
view  of  kriging  concepts  may  wish  to  read  only 
Chapter  1  and  bypass  Chapter  2  altogether. 

b.  Chapter  3  provides  a  review  of  texts  fliat 
contain  much  more  detailed  information  regarding 


kriging  theory  than  material  included  in  Chapter  2. 
Chapter  3  also  provides  a  brief  generic  discussion 
of  kriging  software. 

c.  Chapter  4  provides  a  detailed  step-by-step 
discussion  of  variogram  construction  and  demon¬ 
strates  some  pitfalls  and  solutions  to  this  crucial 
process.  Chapter  4  also  discusses  methodologies 
which  investigators  may  use  to  evaluate  their 
variograms. 

d.  Chapter  5  provides  a  discussion  of  prac¬ 
tical  aspects  of  geostatistics  in  a  presentation  of 
several  example  kriging  applications  with  data 
from  the  HTRW  field.  The  examples  are  intended 
to  illustrate  a  few  of  the  many  different  ways 
kriging  can  be  used  in  HTRW  site  investigations 
and  are  not  presented  with  the  same  level  of  detail 
used  in  Chapter  4. 

e.  Chapter  6  provides  additional  detail  on 
some  crucial  aspects  of  kriging  applications  and 
includes  considerations  investigators  may  use  to 
help  determine  if  kriging  is  feasible  for  the  appli¬ 
cation  they  have  in  mind,  or  reviewers  can  deter¬ 
mine  if  the  application  of  geostatistics  was 
appropriate. 

f.  Chapter  7  provides  an  introduction  to  other 
methods  for  spatial  modeling.  This  section  also 
includes  discussion  of  advanced  stochastic  methods 
such  as  simulation. 


1  -4.  An  Overview  of  the  Use  of 
Geostatistics  in  Hazardous,  Toxic,  and 
Radioactive  Waste  Site  Investigations 

a.  General. 

(1)  HTRW  site  investigations  involve  complex 
administrative,  scientific,  and  engineering  func¬ 
tions  and  are  truly  interdisciplinary.  Scientists  and 
engineers,  for  instance,  may  be  confronted  with 
administrative  findings  or  directives,  associated 
with  fiscal,  managerial,  or  regulatory  input,  that 
may  either  guide  or  constrain  their  work.  In  a 
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likewise  fashion,  scientific  findings  may  define  the 
scope  of  administrative  effort. 

(2)  Scientists  and  engineers  involved  in 
HTRW  site  investigations  have  found  that  they 
have  an  implicit  need  for  many  disciplines  to  fixlfill 
the  objectives  of  each  particular  investigation. 
Frequently,  an  HTRW  site  investigation  will 
benefit  firom  input  from  earth-science  disciplines 
such  as  geology,  hydrogeology,  and  chemistry, 
among  others.  Some  HTRW  site  investigations  are 
large  enough  to  use  several  individuals  from  each 
of  these  disciplines,  as  well  as  many  others,  for  the 
duration  of  multi-year  investigations.  Most  disci¬ 
plines  associated  with  HTRW  site  investigations 
will  benefit  from  knowledge  or  input  fi'om  special¬ 
ized  and/or  interdisciplinary  branches;  the  geolo¬ 
gist,  for  example,  will  occasionally  benefit  from 
knowledge  of  geophysics.  Naturally,  interdisci¬ 
plinary  input  also  can  be  very  helpful,  especially  in 
geostatistics,  where  earth-science  disciplines  rely 
on  assistance  from  statisticians. 

(3)  In  this  ETL  and  for  its  purposes,  a  com¬ 
plete  HTRW  site  investigation  is  described  con¬ 
cerning  three  relatively  broad  sequential  activities 
or  phases.  These  phases  are  referred  to  as  initial 
planning,  analysis,  and  implementation  of  remedia¬ 
tion  plans.  Another  very  important  HTRW  site 
investigation  activity,  monitoring,  is  less  discrete 
and  is  a  part  of  all  three  phases.  Monitoring 
represents  the  basis  for  analysis,  is  often  modified 
as  a  result  of  analysis,  and  may  be  newly  imple¬ 
mented  as  part  of  remediation. 

(4)  Kriging  techniques  can  and  have  been  used 
in  any  of  the  three  phases.  Only  a  few  very  basic 
applications  of  kriging  techniques  are  described  in 
tWs  ETL.  The  intent  of  this  ETL  is  to  describe 
basic  concepts  so  that  more  elaborate  applications 
can  be  done  based  on  a  fundamental  understanding 
of  the  procedures  involved. 

(5)  For  examples  of  more  elaborate  applica¬ 
tions,  the  reader  can  refer  to  the  material  cited  in 
Chapter  3.  However,  the  best  applications  are 
developed  by  readers  who  have  a  clear  under¬ 
standing  of  the  goals  associated  with  each 


particular  HTRW  site  investigation  and  also  have 
a  good  basic  understanding  of  the  fundamental 
geostatistical  techniques.  As  alluded  to  here  and 
elsewhere  in  this  ETL,  there  are  many  techniques 
available  for  gridding  data;  kriging  has  an  added 
advantage  of  generating  kriging  standard  devia¬ 
tions  that  can  be  used  as  a  measure  of  uncertainty. 

b.  Initial  planning. 

(1)  Initial  planning  may  involve  several 
aspects  associated  with  implementing  or  operating 
a  monitoring  network;  it  also  may  involve  recon¬ 
naissance  evaluation  of  an  existing  network.  Addi¬ 
tionally,  because  monitoring  is  present  in  all 
phases  of  HTRW  site  investigations,  the  same 
opportunities  for  geostatistical  applications  asso¬ 
ciated  with  network  analysis  that  occur  in  the 
initial  planning  stages  may  occur,  perhaps  often, 
throughout  the  investigation.  The  information 
available  fi:om  kriging  standard  deviations  can  add 
much  to  sampling  or  monitoring  network  analysis. 

(2)  For  application  of  geostatistical  tech¬ 
niques,  the  most  likely  aspects  of  network  imple¬ 
mentation  and  operation  to  be  addressed  certainly 
include  network  design,  evaluation,  and  modifi¬ 
cation.  Geostatistics  offer  the  investigator  oppor¬ 
tunities  to: 

(a)  Locate  areas  where  existing  sampling  or 
monitoring  networks  may  provide  strong  or  weak 
estimates. 

(b)  Quantify  the  effect  of  increasing  or 
decreasing  the  sampling  or  monitoring  network 
density. 

(c)  Evaluate  the  effect  of  removing  or  relocat¬ 
ing  certain  monitoring  locations  or  adding  new 
locations  to  the  sampling  or  monitoring  network. 

c.  Analysis. 

(1)  Although  aspects  of  network  design  can  be 
quite  important  during  analysis,  the  investigator  is 
likely  to  be  concerned  principally  with  using  infor¬ 
mation  from  monitoring  networks  to  evaluate 
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environmental  conditions  throughout  the  specified 
study  area.  The  evaluations  may  require  either 
point  or  block  estimates.  Often,  design  factors  are 
addressed  in  the  analytical  phase  as  well. 

(2)  A  common  application  for  kriging  tech¬ 
niques  in  HTRW  site  investigations  is  estimating 
real  means.  More  common,  however,  is  estimating 
the  extent  of  areal  contamination.  Usually  these 
estimates  involve  chemicals  in  air,  water,  and  soil; 
however,  if  sufficient  information  is  available, 
such  estimates  could  include  a  wide  range  of 
environmental  factors  that  involve  many  issues 
other  than  contaminants.  Perhaps  the  most  com¬ 
mon  examples  concern  geologic  and  hydrologic 
factors,  such  as  depth  to  bedrock  and  groundwater- 
level  elevations.  Investigators  need  to  realize  that 
almost  any  set  of  measurements  can  be  distributed 
using  kriging  techniques,  providing  there  is  a 
sufficient  amount  and  distribution  of  measured 
information. 

(3)  The  investigator  also  needs  to  realize  that 
the  resultant  kriging  estimates  can  be  gridded. 

This  gridding  affords  investigators  opportunities  to 
perform  mathematical  or  logical  operations,  or 
both,  on  the  kriging  estimates,  provided  that 
investigators  are  comfortable  with  kriging  esti¬ 
mates.  Saturated  thickness  could,  for  example,  be 
calculated  fi*om  kriging  estimates  for  groundwater 
elevations  and  base  of  aquifer  elevations. 

(4)  Often,  after  preparing  estimates  for  areal 
properties,  the  investigator  may  appreciate  the 
opportunity  afforded  by  kriging  techniques  to  eval¬ 
uate  the  confidence  associated  with  the  estimates. 
Maps  of  kriging  standard  deviations  can  provide 
the  investigator  with  information  concerning  the 
confidence  associated  with  the  kriging  estimates. 
Although  the  areas  of  lowest  confidence  may  be 
well-known  intuitively,  maps  of  the  kriging  stan¬ 
dard  deviation  are  an  important  step  toward  quan¬ 
tification.  More  often  than  not,  even  the  most 
experienced  investigator  will  benefit  from  careful 
study  of  maps  of  kriging  standard  deviations. 

d.  Implementation  of  remediation. 


(1)  One  of  the  most  common  applications  for 
kriging  techniques  in  the  final  phases  of  HTRW 
site  investigations  is  evaluating  compliance.  For 
instance,  a  question  such  as  ‘Is  the  mean  concen¬ 
tration  of  constituent  x  within  compliance  limits?” 
is  ubiquitous  to  HTRW  site  investigations.  Mak¬ 
ing  determinations  concerning  compliance  is  very 
similar  to  estimating  areal  extent  as  part  of  the 
analysis.  Investigators  and  managers  have  much 
to  gain  from  the  confidence  information  available 
from  kriging  techniques  as  to  the  reliability  of 
estimates  as  well  as  in  optimizing  monitoring 
networks. 

(2)  Kriging  can  also  be  very  useful  if  man¬ 
agers  are  interested  in  making  decisions  based  on 
the  probability  of  certain  conditions  existing.  If  a 
condition  can  be  defined  by  the  manager,  then,  pro¬ 
viding  there  are  adequate  data,  indicator  kriging 
can  provide  an  estimate  for  the  probability  of 
existence.  A  common  example  of  this  kind  of 
application  is  making  areal  determinations  for 
probabilities  that  concentrations  for  a  constituent 
do  or  do  not  exceed,  for  example,  an  action  level. 

(3)  There  are  many  operational  remediation 
issues  that  kriging  techniques  may  address  as  well. 
Remedial  activities  at  HTRW  sites  often  need  esti¬ 
mates  for  amounts  in  general.  For  instance,  there 
could  be  a  need  for  information  regarding  volumes 
of  contaminants  to  be  treated,  volumes  of  soil  to  be 
excavated,  volumes  of  soil  to  be  stored,  and  so  on. 
By  combining  estimates  for  different  geologic, 
hydrologic,  and  chemical  factors,  estimates  for 
these  volumes  can  be  obtained  from  kriging  tech¬ 
niques  in  much  the  same  way  as  saturated  thick¬ 
nesses  can  be  calculated. 


1-5.  An  Overview  of  Some  Technical 
Aspects  of  Geostatistics 

The  purpose  of  this  section  is  to  provide  an 
overview  of  some  of  the  procedures  and  concepts 
to  be  treated  in  detail  in  this  ETL.  Some  of  the 
technical  ideas  and  terminology  will  be  introduced 
in  very  general  terms,  with  the  goal  of  orienting  the 
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reader  who  may  not  be  familiar  with  the  area  of 
geostatistics. 

a.  General  considerations  in  spatial 
prediction. 

(1)  The  principal  technical  issue  considered  in 
this  ETL  is  spatial  prediction  or  modeling  values 
of  a  spatial  process;  in  particular  it  is  considered 
how  best  to  make  use  of  measurements  of  a  vari¬ 
able  (such  as  pollutant  concentration)  at  sampled 
locations  to  make  inferences  (or  predictions)  about 
that  variable  at  unsampled  locations  or  about 
values  of  the  variable  for  the  region  as  a  whole. 

(2)  A  spatial  process  can  be  viewed  as  having 
a  large-scale  or  regional  component  and  a  smaller 
scale  or  local  component;  both  of  these  compo¬ 
nents  need  to  be  accoimted  for  when  modeling  a 
spatial  process.  The  large-scale  component  is 
referred  to  as  the  mean  field  and  is  most  often 
modeled  by  a  spatial  trend  which  may  or  may  not 
be  constant  over  the  region.  The  smaller  scale 
component  is  a  random  fluctuation  which  is  mathe¬ 
matically  combined  with  the  trend  to  make  up  the 
sample  at  a  point.  The  random  component  is 
usually  assumed  to  be  zero  on  the  average  but  can 
be  either  positive  or  negative  in  individual  samples. 
The  separation  of  the  trend  firom  the  random  com¬ 
ponents  is  problem-  and  scale-dependent  and 
requires  some  judgment  to  determine.  There  can 
be  several  “solutions”  to  fixe  problem  of  separating 
the  trend  and  random  components  that  may  be 
useful  for  various  geostatistical  purposes  when 
using  a  single  set  of  data. 

(3)  Local-scale  fluctuation  of  the  variable  of 
interest  (e.g.  water  levels  or  contaminant  concen¬ 
trations)  at  a  sample  point,  although  random,  can 
show  some  association  (i.e.  correlation)  with  the 
random  fluctuations  at  nearby  points.  This  is 
referred  to  as  spatial  correlation.  Positive  spatial 
correlation  between  measurements  means  that  the 
random  components  at  both  points  tend  to  have  the 
same  sign,  whereas  negative  correlation  means  the 
random  components  tend  to  have  opposite  signs. 
Both  the  “large-scale”  trend  and  tire  positive 
spatial  correlation  of  the  “local-scale”  fluctuations 


contribute  to  measurements  taken  at  locations  close 
together  being  more  closely  related  than  measure¬ 
ments  taken  farther  apart. 

(4)  The  most  obvious  way  one  might  proceed 
for  spatial  prediction  at  unsampled  locations  is 
simply  to  take  an  average  of  the  sample  values  that 
one  does  have  and  assume  that  this  value  gives  a 
reasonable  prediction  at  all  locations  in  the  region 
of  interest.  This  may  work  adequately  in  some 
cases,  but  one  can  also  see  the  pitfalls  in  doing 
this.  Using  a  single  value  for  an  entire  region 
makes  an  implicit  assumption  of  spatial  homo¬ 
geneity.  It  ignores  any  spatial  trends  that  might 
exist  in  the  data  and  it  also  ignores  spatial  conti¬ 
nuity.  If  it  is  known  that  the  variable  of  interest 
does  have  the  tendency  to  be  spatially  correlated, 
then  it  would  make  sense  to  use  a  weighted  average 
rather  than  a  simple  average  in  making  a  spatial 
prediction,  with  measurements  at  sampled  loca¬ 
tions  that  are  nearer  to  the  unsampled  location 
being  given  more  weight.  This  then  is  the  motiva¬ 
tion  for  the  geostatistical  methods  discussed  in  this 
ETL.  The  method  known  as  kriging,  which  is  the 
principal  subject  to  be  considered  here,  is  a  tech¬ 
nique  for  determining  in  an  optimal  manner  the 
weighting  of  measurements  at  sampled  locations 
for  obtaining  predictions  at  unsampled  locations. 
These  optimal  weights  depend  on  spatial  trends 
and  correlations  that  may  be  present. 

(5)  There  are  a  number  of  ways  to  go  about 
performing  spatial  prediction.  The  geostatistical 
method  of  kriging  covered  in  this  ETL  belongs  to  a 
class  of  methods  known  as  stochastic  mefliods.  In 
these  methods,  it  is  assumed  that  the  measure¬ 
ments,  both  actual  and  potential,  constitute  a  single 
realization  of  a  random  (or  stochastic)  process. 

One  advantage  of  assuming  the  existence  of  such  a 
random  process  is  that  measures  of  uncertainty, 
such  as  Ae  variance  used  in  kriging,  can  be 
defined.  These  measures  of  uncertainty  permit 
objective  assessment  of  the  performance  of  a 
spatial  prediction  technique  on  the  basis  of  how 
small  such  measures  are.  Once  a  measure  of 
uncertainty  has  been  selected,  the  weights  to  be 
used  in  spatial  prediction  may  be  determined  so  as 
to  exphcitly  minimize  file  measure  of  uncertainty. 
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In  short,  flie  use  of  stochastic  techniques  provides 
the  investigator  with  a  way  of  objectively  quanti¬ 
fying  errors  and  determining  weights.  In  practice, 
spatial  predictions  obtained  using  kriging  are 
almost  always  accompanied  by  a  measure  of  the 
associated  error.  Most  kriging  practitioners 
consider  such  an  error  evaluation  to  be  an  integral 
part  of  the  analysis,  and  point  to  error  analysis  as 
one  of  the  principal  advantages  of  using  kriging  (or 
stochastic  techniques  in  general)  over  other 
procedures. 

(6)  Nonstochastic  techniques,  on  the  other 
hand,  are  typically  applied  strictly  empirically, 
with  no  assumptions  concerning  the  existence  of  an 
underlying  random  process  and  with  no  theoretical 
framework  with  which  to  evaluate  statistically  the 
performance  or  optimality  of  the  techniques. 

When  they  are  applied  in  such  a  maimer,  it  is  not 
possible  to  evaluate  in  advance  whether  such  a 
procedure  would  be  expected  to  yield  results  that 
are  satisfactory.  Two  techniques  that  are  com¬ 
monly  applied  in  a  nonstochastic  setting  are  simple 
averaging,  mentioned  above,  and  trend  analysis, 
which  is  a  least-  squares  method  for  fitting  a 
smooth  surface  to  the  data.  Even  though  these 
techniques  are  usually  applied  nonstochastically,  it 
is  still  possible  to  assess  ^eir  performance  if  a 
stochastic  setting  is  assumed.  Loosely  speaking 
(these  ideas  are  discussed  more  precisely  in  Chap¬ 
ter  7),  simple  averaging  would  perform  well  if 
there  is  no  trend  and  no  spatial  correlation,  and 
trend  analysis  would  perform  well  if  there  is  a 
trend  that  can  be  modeled,  but  no  spatial  correla¬ 
tion.  Lack  of  correlation  in  the  observations  is  one 
assumption  that  is  made  in  ordinary  statistical 
regression  analysis,  and  in  fact  trend  analysis,  if  it 
is  placed  in  a  stochastic  setting,  is  actually  one 
special  type  of  regression.  The  stochastic  method 
of  kriging  explicitly  incorporates  the  spatial  corre¬ 
lations  \ndiich  are  ignored  in  trend  analysis.  In 
Chapter  7,  a  few  other  common  techniques  that  are 
usually  applied  in  a  nonstochastic  setting  will  be 
discussed  briefly.  Most  of  fliese  techniques  are 
designed  to  incorporate  the  notion  of  spatial  con¬ 
tinuity,  but  the  way  it  is  incorporated  may  be 
subjective.  Kriging  provides  an  objective  means  of 
incorporating  the  presence  of  spatial  correlation 


and  makes  expUcit  die  background  assumptions 
that  are  being  made. 

b.  Important  geostatistical  concepts.  Below 
are  some  of  die  key  ideas  in  geostatistics  that  will 
be  given  detailed  attention  in  this  ETL.  They  are 
introduced  in  much  the  same  order  that  they  are 
discussed  in  Chapter  2,  where  more  detail  is 
presented. 

(1)  Variograms. 

(a)  A  central  idea  in  geostatistics  is  the  use  of 
spatial  correlation  to  improve  spatial  predictions, 
or  interpolations.  The  variogram  is  the  principal 
tool  used  to  characterize  the  degree  of  spatial 
correlation  present  in  the  data  and  is  fundamental 
to  kriging.  The  correlation  between  measurements 
at  two  points  is  usually  assumed,  as  described 
above,  to  depend  on  die  separation  between  the  two 
points.  Values  for  all  possible  pairings  of  sample 
points  can  be  examined  by  squaring  the  difference 
between  the  values  in  each  pair.  The  squared 
differences  are  then  categorized  according  to  die 
distance  separating  the  pair.  For  small  separa¬ 
tions,  or  lags,  the  squared  differences  are  usually 
small  and  increase  as  the  lag  increases.  A  plot  of 
the  squared  differences  per  sample  pair  as  a  func¬ 
tion  of  lag  is  referred  to  as  die  sample  variogram. 

(b)  The  general  behavior  of  the  sample  vari¬ 
ogram  points  relates  to  the  spatial  correlation 
between  sample  sites  and  can  provide  investigators 
with  qualitative  information  about  the  spatial  pro¬ 
cess,  but  in  order  to  use  this  information  in  a  math¬ 
ematically  rigorous  manner  as  a  basis  for  inter¬ 
polation,  a  function  with  specific  properties  must 
be  fit  to  the  sample  variogram  points.  The  fit,  as 
with  all  curve-fitting  procedures,  takes  the  scat¬ 
tered  points  and  passes  a  smooth  curve  through  the 
points.  The  curve,  which  can  be  represented  by  a 
mathematical  expression  or  function,  is  called  a 
model.  Several  named  models  with  characteristic 
features  introduced  in  Chapter  2  are  commonly 
used  in  geostatistics.  The  resultant  variogram 
model  is  used  to  determine  kriging  weights  for  use 
in  interpolation. 
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(2)  Directional  variogram  and  anisotropy.  It 
is  often  the  case  that  spatial  correlation  depends 
not  only  on  distance  between  points,  but  also  on 
direction.  For  example,  measurements  at  pairs  of 
points  100  m  apart  with  the  line  between  them 
oriented  in  a  north-south  direction  may  have  a 
different  correlation  than  measurements  at  points 
the  same  distance  apart  but  with  the  line  joining 
them  oriented  in  an  east-west  direction.  The 
spatial  process  is  said  to  exhibit  anisotropy,  and 
what  is  known  as  a  directional  variogram  must  be 
used  for  the  geostatistical  analysis. 

(3)  Kriging  and  kriging  variance. 

(a)  Kriging  yields  optimal  spatial  estimates  at 
points  where  no  measurements  exist  in  terms  of  the 
values  at  points  where  one  does  have  data.  As 
discussed  above,  placing  the  problem  in  a  sto¬ 
chastic  framework  permits  precision-defining 
optimality.  In  kriging,  the  restriction  is  first 
imposed  that  the  predicted  value  at  any  point  is  a 
linear  combination  of  the  measured  values;  that  is, 
the  kriging  estimate  is  a  linear  predictor.  Given 
fliis  restriction,  the  values  of  the  coefficients  in  this 
linear  function  are  chosen  so  as  to  force  fee  pre¬ 
dictor  to  be  optimal. 

(b)  The  first  criterion  imposed  is  feat  the 
estimate  be  unbiased,  or  that  m  an  average  sense 
fee  difference  between  fee  predicted  value  and 
actual  value  is  zero.  The  second  optimality  cri¬ 
terion  is  that  fee  prediction  variance  be  minimized. 
This  variance  is  a  statistical  error  measure  defined 
to  be  fee  average  squared  difference  between 
predicted  and  actual  values.  Because  fee  kriging 
estimate  minimizes  this  variance,  it  is  known  as  fee 
best  (minimum  variance)  rmbiased  linear  predictor. 
This  minimization  is  performed  algebraically  and 
results  in  a  set  of  equations  known  as  fee  kriging 
equations,  which  give  an  explicit  representation  of 
fee  optimal  coefficients  (weights)  in  terms  of  fee 
variogram.  The  form  of  these  equations  is  pre¬ 
sented  in  Chapter  2. 

(c)  Also  given  in  Chapter  2  is  an  expression 
for  fee  kriging  variance.  This  variance  depends  on 
geometry  of  fee  data  sites,  wife  the  variance  at 


locations  near  points  wife  measurements  tending  to 
be  smaller.  One  can  then  associate  with  any  spa¬ 
tial  prediction  a  variance,  which  gives  an  indi¬ 
cation  of  the  uncertainty  in  that  predicted  value. 

As  mentioned  before,  this  measure  of  uncertainty 
gives  kriging  one  of  its  principal  advantages  over 
many  other  techniques. 

(4)  Trends  and  universal  kriging.  Special 
attention  must  be  given  in  kriging  to  fee  question 
of  whether  there  are  spatial  trends  in  fee  data.  A 
trend  in  this  case  is  usually  any  detectable  ten¬ 
dency  for  fee  measurements  to  change  as  a  fimc- 
tion  of  fee  coordinate  variables  but  can  also  be  a 
function  of  other  explanatory  variables.  For 
example,  aside  fi'om  random  fluctuations,  measure¬ 
ments  of  groundwater  elevations  may  exhibit  a  ten¬ 
dency  to  increase  in  a  consistent  manner  fee  farther 
one  proceeds  in  a  certain  direction.  A  kriging 
analysis  in  which  there  is  no  spatial  trend  is  known 
as  ordinary  kriging;  when  a  trend  does  exist,  uni¬ 
versal  kriging  should  be  considered.  In  universal 
kriging,  one  attempts  to  account  for  fee  trends 
present.  For  example,  it  might  be  assumed  feat  fee 
trend  can  be  represented  as  a  linear  function  of 
coordinate  variables.  The  form  of  fee  trend  model 
is  then  incorporated  into  fee  universal  kriging 
equations  to  obtain  the  optimal  weights. 

(5)  Block  kriging.  What  has  been  discussed  in 
fee  preceding  paragraphs  is  usually  known  as 
point,  or  punctual,  kriging.  In  point  kriging,  fee 
goal  is  to  predict  fee  value  of  a  variable  at  discrete 
locations.  By  contrast,  in  block  kriging  fee  goal  is 
to  predict  fee  average  value,  over  a  specified 
region,  of  a  variable.  As  in  point  kriging,  fee  opti¬ 
mal  predictor  is  a  linear  combination  of  fee  mea- 
sxjred  data  values,  and  degree  of  uncertainty  is 
indicated  by  a  block  kriging  variance.  Block 
kriging  variances  tend  to  be  smaller  than  point 
kriging  variances  because  averages  tend  to  be  less 
variable  than  individual  values. 

(6)  Prediction  intervals  and  normality. 

(a)  A  standard  kriging  analysis  will  give  two 
values  for  any  location;  fee  optimal  kriging  esti¬ 
mate  and  fee  kriging  variance.  The  variance 
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provides  a  measure  of  uncertainty  for  the  predic¬ 
tion.  In  some  cases,  it  may  be  desirable  to  go  even 
further  in  specifying  the  nature  of  the  uncertainty 
than  simply  giving  the  variance.  One  way  to  pro¬ 
ceed  is  to  try  to  obtain  what  is  known  as  a  predic¬ 
tion  interval.  Here  one  seeks  an  interval  such  that 
diere  is  a  certain  probability,  typically  95  percent, 
fliat  the  actual  value  lies  in  this  interval. 

(b)  Finding  such  an  interval  often  hinges  on 
having  knowledge  of  the  probability  distribution  of 
flie  variables  being  sampled.  One  ideal  situation  is 
when  the  variable  of  interest,  e.g.,  contaminant 
concentration,  can  be  assumed  to  have  a  normal 
distribution.  In  this  case,  given  the  set  of  measured 
values,  a  potential  value  at  an  imsampled  location 
has  a  normal  distribution  with  mean  given  by  the 
kriging  estimate  and  variance  given  by  the  kriging 
variance.  It  is  thus,  using  classical  statistics, 
straightforward  to  use  this  normal  distribution  to 
obtain  a  95  percent  prediction  interval  for  concen¬ 
tration  at  the  unsampled  location. 

(7)  Transformations.  Having  a  prediction 
interval  will  generally  be  much  more  informative 
flian  simply  having  the  kriging  estimate  and  kriging 
variance,  which  explains  why  investigators  often 
ask  wheftier  normality  assumptions  can  be  made 
for  their  data.  When  a  normality  assumption 
cannot  be  made,  it  is  sometimes  possible  to  find  a 
transformation  that  will  make  the  data  normal,  or 
nearly  so.  For  example,  a  transformation  that  is 
often  tried  is  the  logarithmic  transformation.  That 
is,  one  simply  takes  the  logarithm  of  all  data  values 
(assuming  they  are  >  0)  and  performs  the  geosta- 
tistical  analysis  on  these  transformed  values  rather 
than  on  the  original  data.  Prediction  intervals 
obtained  using  transformed  values  can  be  readily 
converted  to  corresponding  intervals  on  untrans¬ 
formed  variables.  There  are,  however,  subtleties 
ftiat  must  be  considered  in  back-transforming  the 
kriging  estimate  and  the  kriging  variance;  these  are 
discussed  in  more  detail  in  Chapter  2. 

(8)  Indicator  kriging. 


(a)  In  indicator  kriging,  analysis  is  performed 
using  what  are  known  as  indicator  variables  rather 
than  the  measured  data  themselves.  An  indicator 
variable  is  thus  a  special  kind  of  transform  of  the 
measured  data  and  can  have  only  two  possible 
values:  0  or  1.  To  obtain  the  indicator  variables 
to  be  analyzed,  first  specify  a  threshold  value,  say 
c,  which  may  represent,  for  example,  a  contami¬ 
nant  concentration  level  which  is  of  particular 
importance.  At  each  measurement  location,  the 
indicator  variable  is  then  assigned  a  value  of  1  if 
the  measured  value  is  less  than  or  equal  to  c,  and  is 
assigned  a  value  of  0  if  the  measured  value  is 
greater  than  c.  This  kind  of  transform  will  allow 
censored  data,  or  data  reported  as  less  than  some 
reporting  limit,  to  be  included  in  the  analysis  if  the 
reporting  limit  is  less  than  or  equal  to  the  cutoff 
value  of  c.  After  the  indicator  transform  has  been 
performed,  the  kriging  analysis  is  performed  using 
these  indicator  variables  in  the  same  manner  dis¬ 
cussed  above;  first  a  variogram  is  obtained,  and 
the  kriging  equations  yield  the  optimal  linear  pre¬ 
dictor  and  the  kriging  variance  for  the  indicators. 

(b)  Whereas  the  indicator  kriging  analysis  is 
done  using  only  O’s  and  I’s,  the  interpolated  esti¬ 
mates  are  not  restricted  to  these  two  values.  In 
most  cases  the  estimates  are  between  0  and  1, 
which  is  interpreted  to  be  the  probability  that  the 
actual  value  is  less  than  or  equal  to  the  threshold  c. 
Performing  this  analysis  for  a  number  of  different 
threshold  values,  c,  can  give  the  investigator  infor¬ 
mation  about  the  probability  distribution  of  con¬ 
taminant  values  at  a  location,  which  may  in  turn  be 
used  to  obtain  prediction  intervals.  As  discussed 
above,  such  intervals  may  even  be  more  valuable 
than  having  only  the  optimal  predictor  and  vari¬ 
ance  provided  by  the  usual  kriging  analysis,  partic¬ 
ularly  if  behavior  of  extremes  may  be  of  interest  to 
the  investigator.  The  advantage  of  using  indicator 
kriging  to  obtain  prediction  intervals  is  that  it  is 
not  necessary  to  assxjme  a  distribution  for  the  data, 
as  in  the  discussion  of  normality  above. 
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Chapter  2 

Technical  Aspects  of  Geostatistics 


2-1.  General 

a.  This  chapter  provides  technical  aspects  or 
the  necessary  theoretical  background  for  under¬ 
standing  kriging  applications.  Emphasis  will  be 
placed  on  presentation  of  the  basic  ideas;  long 
formulas  or  derivations  are  kept  to  a  minimum. 
Statistical  terms  that  are  commonly  used  in 
geostatistical  applications  will  be  highlighted  with 
bold  text  and  briefly  defined  as  they  are  intro¬ 
duced;  notation  used  in  fliis  ETL  is  also  tabulated 
in  Appendix  B.  The  reader  who  wishes  a  more 
thorough  discussion  of  these  fundamental  concepts 
may  consult  the  references  cited  in  Qiapter  3. 
Previous  exposure  to  engineering  statistics  at  die 
level  of  Devore  (1987)  and  Ross  (1987)  would  be 
helpful  in  imderstanding  some  parts  of  tiiis  chap¬ 
ter.  Readers  with  limited  statistical  experience 
may  wish  to  briefly  scan  this  chapter  and  refer 
back  to  it  after  reading  the  remaining  chapters. 

b.  In  section  2-2,  regionalized  random  vari¬ 
ables  are  discussed.  Regionalized  random  varia¬ 
bles  constitute  the  random  process  that  is  sampled 
to  obtain  the  observed  data  that  are  available  for 
analysis.  Basic  ideas  related  to  probability  distri¬ 
butions,  means,  variances,  and  correlation  are 
introduced.  The  variogram,  which  is  the  funda¬ 
mental  tool  used  in  geostatistics  to  analyze  spatial 
correlation,  is  introduced  in  section  2-3.  In  sec¬ 
tion  2-4  how  kriging  is  used  to  obtain  the  best 
weights  for  spatial  prediction  is  discussed,  and 
how  the  mean  squared  prediction  error  for  these 
predictions  is  computed  is  also  shown.  Section  2-5 
deals  briefly  with  co-kriging,  which  is  prediction  of 
one  variable  based  not  only  on  measurements  of 
that  variable  but  on  measurements  of  other  vari¬ 
ables  as  well.  Finally,  section  2-6  shows  how 
kriging  may  be  applied  to  determine  not  just  opti¬ 
mal  spatial  predictions  but  also  probabilities 
associated  with  various  events,  such  as  extreme 
events  that  may  be  of  importance  in  risk-based 
analyses. 


2-2.  Regionalized  Random  Variables 

a.  General. 

(1)  Suppose  the  extent  of  grormdwater  con¬ 
tamination  of  a  particular  pollutant  over  a  given 
study  area  is  being  determined.  To  simplify  the 
presentation,  all  data  are  assumed  to  be  distributed 
over  a  two-dimensional  region.  In  three- 
dimensional  groundwater  flow  systems,  one  could 
study  the  depth-averaged  concentration  of  a  pol¬ 
lutant  or  the  concentration  of  the  pollutant  in  a  par¬ 
ticular  horizontal  stratum  of  the  flow  system.  Let 
a  vector  x=fM,v^  denote  an  arbitrary  spatial  loca¬ 
tion  in  the  study  area.  Unless  oflierwise  stated,  it 
will  be  assumed  throughout  the  ETL  that  u  is  the 
east-west  coordinate  and  v  is  the  north-south 
coordinate  (Figure  2-1).  Denote  by  a  meas- 
luement  at  location  x,  such  as  the  concentration  of 
a  pollutant.  The  ultimate  goal  of  an  investigator 
would  be  to  determine  z(^  for  all  locations  in  the 
study  area.  However,  without  explicit  knowledge 
of  the  flow  and  transport  field,  this  goal  cannot  be 
achieved.  Therefore,  suppose,  instead,  that  the 
goal  is  to  estimate  the  values  of  z(^  with  a  given 
error  tolerance.  In  other  situations,  small  estima¬ 
tion  error  over  some  parts  of  the  study  area  (for 
instance,  near  a  domestic  water  supply)  may  need 
to  be  obtained,  while  allowing  larger  estimation 
errors  in  other  parts  of  the  study  area.  The  theory 
of  regionalized  random  variables  is  designed  to 
accomplish  these  goals. 

(2)  In  the  regionalized  random  variable  theory, 
the  true  measurement  z(^  is  assumed  to  be  the 
value  of  a  random  variable  Z(x).  Associating  a 
random  variable  Z(^  with  a  true  measurement  z(x) 
is  done  for  the  purpose  of  characterizing  the  degree 
of  uncertainty  in  the  quantity  of  interest  at  point  x. 
If  there  is  no  actual  measurement  taken  at  x,  then 
the  values  taken  on  by  Z(x)  represent  “potential” 
measurements  at  x;  that  is,  Z(^  represents  possible 
values  that  might  be  expected  if  a  measurement 
were  taken  at  x.  Because  there  is  uncertainty  asso¬ 
ciated  with  Z(^,  it  needs  to  be  characterized  by  a 
probability  distribution,  defined  by  P  [Z(i)  ^  c] 
where  P  denotes  probability  and  c  is  any  constant. 
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This  distribution  is  a  function  of  c,  and,  to  be  com¬ 
pletely  defined,  needs  to  be  known  for  all  values  of 
c.  The  distribution  is  used  to  make  evaluations 
such  as:  suppose  diat  we  have  no  measurement  of 
concentration  of  a  certain  contaminant  at  x,  but  the 
distribution  is  known,  and  a  threshold  value  of 
c  =  8  mg/1  is  of  interest.  If  P  [Z  (i)  :£  8]  =  0.60 , 
then,  if  a  measurement  were  made  at  x,  there  is  a 
60-percent  chance  of  obtaining  a  value  less  than  or 
equal  to  8  mg/1.  The  distribution  also  may  be  used 
to  calculate  other  probabilities,  such  as  the  proba¬ 
bility  of  obtaining  a  value  in  some  specified 
interval. 

(3)  An  important  concept  to  keep  in  mind  in 
all  geostatistical  applications  is  the  support  of  the 
regionalized  random  variable.  The  support  of  Z(^ 
is  the  in  situ  geometric  unit  represented  by  an 
individual  sample.  For  example,  in  a  soil  contami¬ 
nation  study,  sample  Z(^  might  represent  the  con¬ 
centration  of  a  contaminant  in  a  vertical  soil  core 
0. 1  m  in  diameter  and  1  m  in  length,  and  centered 
at  location  x.  Thus,  even  though  Z(^  is  defined  at 
a  particular  point,  it  is  representative  of  a  volume 
of  soil.  Changing  the  support  of  Z(i)  will  usually 
change  its  probability  distribution.  Therefore,  the 
observations  in  a  geostatistical  analysis  should  all 
have  the  same  support.  The  method  called  point, 
or  punctual,  kriging,  described  in  section  2-4,  is 
designed  to  predict  values  of  Z(^  with  the  same 
support  as  the  sample  data. 

(4)  A  concept  closely  related  to  support  is  that 
of  estimation  block,  which  is  a  geometric  unit 
larger  than  the  support  of  a  single  observation,  for 
which  a  single  representative  value  is  desired.  For 
example,  in  the  above  soil  contamination  study,  it 
may  be  necessary  to  estimate  the  average  concen¬ 
tration  of  the  contaminant  in  a  tmckload  of  soil 
excavated  from  a  block  6  m  long,  6  m  wide,  and 
0.3  m  thick.  Using  a  method  called  block  kriging, 
also  described  in  section  2-4,  the  block  average  can 
be  predicted  based  on  individual  measurements. 

(5)  Although  the  distribution  of  Z(x)  com¬ 
pletely  characterizes  Z(x)  at  any  particular  loca- 

Figure  2-1.  Diagrams  showing  A,  hypothetical  tion,  this  distribution  indicates  nothing  about  the 

study  area;  B,  stationary  covariance  function;  and  relations  among  the  values  of  Z(^  at  different 
C,  isotrophic  covariance  function 
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locations,  which  is  very  important,  because  geo¬ 
statistics  is  based  on  using  a  measurement  of  a 
regionalized  variable  at  one  location  to  gain  infor¬ 
mation  about  values  of  the  variable  at  another 
location.  The  notion  of  distribution  of  Z(^  at  a 
single  location  is  readily  generalized  to  two  or 
more  locations.  For  two  locations,  if  we  let  and 
Xz  be  two  distinct  locations,  then  the  joint  proba¬ 
bility  distribution  is  defined  to  be  the  probability 
P  [Z  (xi)  iCi,Z(x2)^  C2]  for  any  constants  c,  and 
C2.  This  latter  probability  means  the  probability 
that  both  Z  (xi)  ^  C]  and  Z  (X2)  ^  C2.  If  the  vari¬ 
ables  Z(x,)  and  Zfe)  are  statistically  independent 
of  one  another,  then  the  joint  probability  distri¬ 
bution  can  be  obtained  as  the  product  of  the  indi¬ 
vidual  probability  distributions, 

P  [Z  (x  )  ^  c^,  Z  (x  )  ^  Cj] 

(2-1) 

=  P  [Z  (i,)  ^  c,]  P  [Z  (x^)  ^  cj 

However,  in  most  applications,  Z(xi)  and  Zfe)  will 
not  be  statistically  independent  and  their  joint 
distribution  cannot  be  obtained  from  the  individual 
distributions.  When  this  joint  distribution  descrip¬ 
tion  is  applied  to  more  than  two  locations,  specifi¬ 
cation  of  the  full  spatial  distribution  of  Z  would 
require  knowing  the  joint  distribution  of  Z(xi), 

Z{x„)  for  any  set  of  n  spatial  locations  and  for  any 
n;  however,  except  in  very  special  cases,  working 
with  the  full  set  of  distribution  functions  of  Z(^  is 
not  feasible  and  is  not  done. 

(6)  To  simplify  the  problem  even  further,  vari¬ 
ous  parameters  of  the  distributions  are  usually 
considered  rather  than  dealing  with  the  entire  dis¬ 
tributions.  The  parameter  most  commonly  used  to 
characterize  a  distribution  is  the  mean,  or,  because 
the  mean  in  geostatistical  applications  depends  on 
the  spatial  variable  ^  the  mean  may  be  called  the 
spatial  mean,  or  the  drift.  In  statistics,  the  mean  is 
referred  to  as  the  expectation  (E)  of  the  random 
variable  Z(y ,  and  the  symbol  m  is  used  in  this 
report  to  denote  this  expectation  Thus, 

p(£)  =  E  [  Z  (I)  ]  (2-2) 


is  used  to  denote  the  mean,  or  expected  value,  of 
the  bracketed  term,  in  this  case  Z(x).  It  is  intui¬ 
tively  helpful  to  think  of  the  expectation  as  an 
average.  In  fact,  if  the  distribution  of  Z{x) 
assigned  equal  probability  to  a  finite  number  of 
values,  then  the  expectation  of  ZQc)  would  indeed 
be  the  simple  average  of  these  numbers.  In  geo¬ 
statistics,  however,  Z(x)  is  usually  assumed  to  take 
on  any  value  in  a  continuous  range  of  possible 
values,  rather  than  being  limited  to  a  discrete  set  of 
values.  In  this  case,  calculus  needs  to  be  used  to 
define  the  expectation.  The  following  example 
illustrates  the  difference  between  averages  and 
expectations. 

b.  Example  L 

(1)  An  experiment  consists  of  injecting  a  con¬ 
servative  tracer  at  a  particular  well  in  a  steady- 
state  groundwater  flow  system  and  measuring  the 
concentration,  Z^Qc),  of  the  tracer  in  a  neighboring 
well  24  hr  later.  The  tracer  is  then  allowed  to  flush 
from  the  system,  and  the  experiment  is  repeated  a 
second  time  to  obtain  another  concentration  mea¬ 
surement,  Z2(x),  at  the  same  location.  If  this 
process  is  repeated  n  times,  n  concentration  mea¬ 
surements  Zi(x),  Z2(x), ...,  Z„(x)  would  be  obtained, 
all  at  location  x.  The  average  concentration  level 
at  location  x  is 

Z„  (x)  =  1  (z,00  +  z^  (x) 

n  ^  (2-3) 

-  ...  -  (i)) 

which  would  change  depending  on  n  and  on  the 
actual  values  obtained  for  Z,(x),  Z2(x)^...,  Z„(x). 
However,  in  the  limit  as  n  increases,  Z^  (x) 
becomes  closer  and  closer  to  the  true  mean,  or 
expected,  concentration 

(s)  ■*  M.  (i)  as  n  increases  (2-4) 

This  theoretical  limit  is  a  constan^alue,  or  popu¬ 
lation  parameter,  as  opposed  to  Z^(i) ,  which  is  a 
random  variable,  or  a  property  of  the  particular 
sample  that  is  taken. 
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(2)  In  example  1,  no  assumptions  were  needed 
concerning  wheflier  die  mean  changed  with  spatial 
location,  because  all  sampling  was  done  at  one 
sampling  location  x.  In  most  HTRW  applications, 
die  mean  will  probably  change  depending  on  the 
sampling  location.  In  addition,  usually  only  one 
observation  is  available  at  any  particular  location. 
Therefore  some  assumptions  regarding  the  struc¬ 
ture  of  p(x)  must  be  made.  For  example,  it  is 
sometimes  appropriate  to  assume  p  (i)  =  p  is 
constant  for  all  x,  in  which  case  Z(^)  is  said  to 
have  a  stationary  mean.  Data  which  have  no 
underlying  trend  such  as  hydraulic  conductivity  in 
a  homogeneous  aquifer,  for  example,  might  be 
assumed  to  have  a  constant  mean.  If  the  mean  is 
constant,  it  makes  sense  to  estimate  it  with  the 
sample  average  of  n  observations  taken  at  different 
spatial  locations  X|,  X2, ...,  x„ 


=  1  [Z  +  Z^) 

.....  z  fe;] 


(2-5) 


However,  in  contrast  to  example  1,  defined  in 
this  way  may  not  get  closer  to  p  as  n  gets  large. 
Because  of  the  possible  spatial  correlation  in  the 
data,  the  size  of  the  sampling  region  must  be  large 
in  relation  to  the  correlation  length  in  order  for 
to  accurately  estimate  p. 


(3)  In  addition  to  the  mean  of  Z(^,  its  varia¬ 
bility  or  dispersion  is  also  of  interest,  and  this 
variability  is  most  commonly  measured  by  the 
(spatial)  variance,  defined  to  be  the  mean  of 
squared  deviations  of  Z(i)  from  p(x)  and  denoted 
by  a^(x). 


o2  00  =  E  [  (Z  (^)  -  p  00)"]  (2-6) 


The  (spatial)  standard  deviation  o(^  is  the 
square  root  of  the  variance.  The  following  exam¬ 
ple  illustrates  the  difference  between  the  popula¬ 
tion  variance,  which  has  been  defined  above,  and  a 
sample  variance. 

c.  Example  2. 


(1)  If  the  scenario  presented  in  example  1  is 
again  used,  the  sample  variance  of  the  n 
measurements  could  be  computed  as  follows: 


n 


E 


(z,  (i)  -  Z„  Qi)f 


(2-7) 


This  number  gives  a  measure  of  dispersion  of  the 
Z,(x)  values  from  their  sample  mean .  The  sample 
variance  depends  on  n  and  on  the  particular  values 
observed  for  Z,(x),  Z2QC), ...,  Z„(x).  However,  in 
the  limit  as  n  increases,  S„^(x)  gets  closer  and 
closer  to  a  constant  value,  which  is  denoted  by 
o^(x).  In  this  case,  a^(x)  is  a  population  param¬ 
eter,  and  S„  is  a  random  variable. 

(2)  The  mean  and  variance  can  both  be  calcu¬ 
lated  from  the  probability  distribution  of  Z(^. 
Again,  in  geostatistics,  the  relations  among  region¬ 
alized  variables  at  different  locations  are  of 
interest.  From  flie  joint  distribution  of  Z(xi)  and 
Z(X2)  the  (spatial)  covariance  function, 

C  (^,,  X  )  =  E  [  (Z  (i )  -  p  (x)) 

^  '  '  (2-8) 

(zu,)  -  P  (a^))] 

may  be  obtained.  This  function  has  a  key  role  in 
geostatistical  analyses.  It  is  a  measure  of  associ¬ 
ation  between  values  obtained  at  point  Xi  and  those 
obtained  at  point  X2.  If  values  at  these  two  spatial 
locations  tend  to  be  greater  than  average  or  less 
than  average  at  the  same  time,  then  the  covariance 
will  be  positive.  However,  if  the  values  vary  in  the 
opposite  direction  (that  is,  one  tends  to  be  larger 
than  average  when  the  other  is  less  than  average, 
and  vice  versa),  the  covariance  will  be  negative. 

(3)  Because  C(ii,X2)  is  an  unknown  population 
parameter,  it  too  must  be  estimated  using  a  sta¬ 
tistic  computed  from  sample  data.  To  make  this 
possible,  it  is  often  assumed  that  the  covariance 
function  depends  only  on  the  distance  between 
points,  which  is  defined  as  the  lag  h,  and  not  on 
their  relative  location  or  orientation. 
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c  (i,,  £^) 

h 


C{h), 


(2-9) 


/(«!  -  «2)^  +  (Vj  -  V/ 


Under  this  assiunption,  C(h)  can  be  estimated  by 
pooling  all  pairs  of  observations  that  are  approxi¬ 
mately  h  units  apart  and  computing  a  sample 
covariance  function 

C  Qt)  =  average 

-  Z)(Z(4.)  -  z)  :  (2-10) 

h  -  /^h  <  hy  <  h  +  A/i| 

where  hij  is  the  distance  between  and  Xj  and  the 
average  is  over  all  pairs  of  points  such  that  h^j  is 
between  h-Ah  and  h+Ah.  The  distance  h  is  called 
the  lag  and  Ah  is  called  the  lag  tolerance.  There 
are  more  effective  ways  to  estimate  C(h)  other  than 
using  Equation  2-10;  for  example,  see  Isaaks  and 
Srivastava  (1989).  However,  because  the  empha¬ 
sis  in  this  ETL  is  on  the  variogram  (to  be  defined 
below)  rather  than  the  covariance  fimction,  we  will 
not  need  to  use  the  estimated  covariance  function. 

(4)  A  covariance  function  is  called  stationary 
if  it  does  not  depend  on  the  origin  of  the  coordinate 
system,  that  is, 

C  +  h,  +  h)  =  C  (i|,  x)  (2-11) 

for  any  given  vector,  b  (Figure  2-1).  The  covari¬ 
ance  function  (Equation  2-9)  is  stationary  because 
changing  the  origin  does  not  change  the  distance 
between  the  points.  Substituting  Xx-X2-x  'v^ 
Equation  2-9  yields 

C  (i,  X)  =  c  (0)  (2-12) 

which,  combined  with  the  definitions  in  Equa¬ 
tions  2-6  and  2-8,  becomes 

=  C  (0)  for  all  x  (2-13) 
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Therefore,  when  Z(^  has  a  stationary  covariance 
function,  the  variance  of  Zfx)  is  constant  for  all  x. 
The  covariance  function  can  then  be  standardized 
by  dividing  it  by  the  variance.  The  resulting 
dimensionless  function  of  h  is  called  the  spatial 
correlation  function, 

p  (A)  =  (2-14) 

C  (0) 

The  correlation  fimction  is  a  scale-independent 
measure  of  linear  association  between  values  of  Z 
at  different  locations.  The  spatial  correlation  is 
always  between  -1  and  +1 ,  with  a  value  of  zero 
indicating  no  linear  association. 

(5)  In  addition  to  being  stationary,  the  covari¬ 
ance  function  in  Equation  2-9  has  another  import¬ 
ant  property.  It  is  also  isotropic,  or  omni¬ 
directional,  because  it  does  not  depend  on  the 
direction  between  the  two  locations.  In  many 
HTRW  applications,  the  correlation  between 
values  of  Z  at  two  locations  is  a  function  of  direc¬ 
tion  as  well  as  lag.  For  example,  contaminant 
concentrations  in  a  groundwater  flow  system  might 
be  more  highly  correlated  along  a  transect  in  the 
direction  of  flow  than  along  a  transect  perpen¬ 
dicular  to  the  flow.  In  that  case,  the  covariance 
function  depends  on  both  the  lag  h  and  the  angle  a 
between  locations, 

c  (i|,  2L)  =  C  (h,  a), 

h  =  ^(M,  -  u^f  +  (v,  -  V2)^(2.i5) 
a  =  atan 


Here,  a  is  the  angle  measured  counterclockwise 
fi’om  flie  east  direction  (Figure  2-1).  In  many  geo- 
statistical  publications  or  computer  packages,  the 
angle  may  be  defined  as  clockwise  firom  the  north 
direction,  so  care  should  be  taken  in  defining  the 
appropriate  angle  in  any  application.  A  covariance 
function  satisfying  Equation  2-15  is  called  aniso¬ 
tropic,  or  multi-directional. 
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(6)  To  summarize,  the  basic  model  fiame- 
work  that  will  be  used  throughout  the  ETL  is  the 
following:  the  value  of  a  measurement  z(^  (con¬ 
centration,  porosity,  hydraulic  head,  and  so  on)  at 
location  x  of  a  two-dimensional  region  is  the  value 
of  a  regionalized  random  variable,  Z(^,  with  mean 
p(x)  and  stationary  covariance  function  C(h,a). 
Other  assumptions  may  be  added  in  the  applica¬ 
tions  sections  to  analyze  specific  data  sets,  but  this 
framework  will  be  the  basic  framework  from 
which  many  of  die  results  will  be  derived.  In  some 
situations,  die  covariance  stationarity  assumption 
may  be  relaxed,  for  instance,  when  using  the  linear 
variogram  described  in  the  next  section. 

2-3.  Variograms 

a.  Regionalized  random  variables  differ  from 
classical  (ordinary  least-squares)  regression 
models  in  that  the  residuals,  defined  as  the  devi¬ 
ations  of  the  regionalized  random  variable  from  its 
mean  and  denoted  by 

Z*  (4  =  Z  (i)  -  n(x)  (2-16) 

are  related  to  one  another,  whereas  the  residuals  in 
a  regression  model  are  generally  assumed  to  be 
independent.  Thus,  in  the  regionalized  random- 
variable  model,  observed  values  of  the  residuals 
from  sampled  locations  contain  valuable  informa¬ 
tion  when  predicting  the  value  of  at  unsam¬ 
pled  sites.  The  relationship  among  the  residuals 
can  be  understood  by  examining  the  variogram, 
which  is  a  tool  that  is  widely  used  in  geostatistics 
for  modeling  the  degree  of  spatial  dependence  in  a 
regionalized  random  variable.  Although  the  vario¬ 
gram  is  closely  related  to  the  covariance  function, 
tiiere  are  some  important  differences  between  the 
variogram  and  covariance  function  that  will  be 
described  below.  The  covariance  function,  and 
related  correlation  function,  are  more  commonly 
used  in  basic  statistics  courses  than  the  variogram, 
so  many  readers  may  be  more  familiar  with  the 
former  concepts.  However,  the  variogram  is  more 
widely  used  in  geostatistics,  and  because  of  this  we 


will  adopt  the  variogram  as  the  primary  tool  for 
analyzing  spatial  dependence  in  the  remainder  of 
this  ETL. 

b.  As  was  the  case  with  the  covariance  func¬ 
tion,  it  is  necessary  to  distinguish  between  the 
theoretical  variogram,  which  is  a  population 
parameter,  and  the  sample  variogram,  which  is  an 
estimator  of  die  theoretical  variogram  obtained 
from  observed  data.  The  theoretical  variogram 
of  a  regionalized  random  variable,  y(xi  ^2)  is 
defined  as  one  half  of  the  variance  of  the  difference 
between  residuals  at  locations  Xi  aadx^: 

Y(£,,  =  |var  [Z  *  (i^)  -  Z  (2-17) 

Because  the  residuals  have  been  mean-centered,  as 
shown  in  Equation  2-16,  they  have  a  mean  of  zero. 
Therefore,  using  the  well-known  formula  for  the 
variance  of  a  random  variable  X 

Var  (A)  =  E  -  (EL!0^  (2-18) 

it  is  seen  that  Equation  2-17  is  equivalent  to 

Y  (i,,  ^)  =  I  E  [Z*  (i,)  -  (2-19) 

The  theoretical  variogram  is  always  non-negative, 
with  a  small  value  of  g  indicating  that  the  residuals 
at  locations  x,  andx2  tend  to  be  close  and  a  large 
value  of  A  indicating  that  the  residuals  tend  to  be 
different.  Equation  2-19  is  sometimes  called  a 
semi-variogram,  because  of  the  multiplication  by 
'A,  but  will  be  referred  to  in  this  ETL  as  a 
variogram. 

c.  It  would  be  ideal  to  know  the  theoretical 
variogram  before  taking  observations,  but  unfortu¬ 
nately,  it  must  be  estimated  using  sample  data.  To 
facilitate  variogram  estimation,  it  is  usually 
assumed  in  a  similar  manner  to  the  covariance 
function  that  y  depends  only  on  tiie  lag. 
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Y  (£,.  ij)  =  Y 

h  =  ^(u^  -  +  (v,  - 


(2-20) 


or  possibly,  on  the  lag  and  angle  between  locations 


Y  (i,.  ij)  =  Y  iK  a), 

h  =  (2-21)  ■ 


(Figure  2-1).  Equation  2-20  is  called  an  isotropic 
variogram  and  Equation  2-21  is  a  directional 
variogram  at  angle  a. 

d.  For  the  isotropic  case,  the  sample,  or 
empirical,  variogram  is  obtained  by  averaging  the 
square  of  all  computed  differences  between  resid¬ 
uals  separated  by  a  given  lag: 


y  (h)  =  ^  ave  |  (z*  (i^) 


(2-22) 


h  -  Ah <hij<h  +  Ah^ 

where,  as  before,  hy  is  the  distance  between  ^  and 
Xj.  For  a  given  h  as  more  and  more  points  sepa¬ 
rated  by  distance  h±  Ah  are  sampled  and  as  Ah 
gets  small,  y  (h)  should  approach  the  theoretical 
variogram.  More  detail  on  variogram  estimation 
will  be  presented  in  Chapter  4,  including  the 
directional  case.  In  this  section,  it  will  be  suffi¬ 
cient  to  describe  some  general  properties  of  iso¬ 
tropic  variograms  that  will  be  referred  to  numerous 
times  in  the  application  sections  to  follow. 


e.  A  plot  of  the  sample  variogram  versus  h 
often  has  a  considerable  degree  of  scatter  (Fig¬ 
ure  2-2),  which  is  especially  evident  if  the  sample 
size  n  is  small.  However,  the  points  can  usually  be 


fitted  by  a  smooth  curve  that  represents  a  theoret¬ 
ical  variogram  selected  from  a  suite  of  possible 
choices.  Usually,  the  theoretical  variogram  is 
monotonically  increasing,  signifying  that  flie  far¬ 
ther  two  observations  are  apart,  the  more  their 
residuals  tend  to  differ,  on  average,  from  one 
another.  Several  properties  common  to  many 
theoretical  variograms  are  shown  in  Figure  2-2.  If 
the  variogram  either  reaches  or  becomes  asymp¬ 
totic  to  a  constant  value  as  h  increases,  that  value 
is  called  die  sill  (Figure  2-2).  The  distance  (value 
of  h)  after  which  the  variogram  remains  at  or 
close  to  the  sill  is  called  the  range.  Measurements 
whose  locations  are  farther  apart  than  the  range  all 
have  the  same  degree  of  association.  Often,  a 
variogram  will  have  a  discontinuity  at  the  origin, 
signifying  that  even  measurements  very  close 
together  are  not  identical.  Such  variation  in  the 
measurements  at  small  scales  is  called  the  nugget 
effect.  The  size  of  the  discontinuity  is  called  the 
nugget.  Although  the  nugget  effect  is  sometimes 
confused  with  measurement  error,  there  is  a  subtle 
difference  between  diese  two  concepts  that  will  be 
explained  in  section  2-4.  A  simple  monotonic 
function  is  usually  selected  to  approximate  die 
variogram.  Four  such  functions  that  are  often  used 
in  practice  are: 

the  exponential  variogram  (parameters:  sill,  s  > 

0;  nugget,  0  <  g  <  s;  range,  r  >  0) 


yQi) 


g  +  (s-g) 

1  -  exp 

0,  h  =  Q 


(2-23) 


the  spherical  variogram  (parameters:  sill,  ^  >  0; 
nugget,  0  <  g  <  5;  range,  r  >  0) 


y{h)  -  1g+(s-g) 


0, 


h>r 

^0<h<.r 

/i  =  0 


(2-24) 
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Figure  2-2.  Diagram  showing  variogram  and  features 


the  Gaussian  variogram  (parameters;  sill,  5  >  0; 
nugget,  0  <  g  <  5 ;  range,  r  >  0) 


g  +  (s-g) 

1  -  exp 

[-3(A)1 

,h>0 

\r)  ) 

0. 

h=0 

(2-25) 


and,  the  linear  variogram  (parameters;  nugget, 
g  >  0;  slope,  b>0) 


Y(A)  = 


\g^bh, 

[  0, 


^>0 


^  =  0 


(2-26) 


f.  Although  there  are  many  other  models  that 
are  used  for  variograms  (Joumel  and  Huijbregts 
1978),  these  four  are  the  most  commonly  used  and 
are  shown  in  Figure  2-3.  The  exponential,  spheri¬ 
cal,  and  Gaussian  models  are  similar  in  that  they 
all  have  a  sill  and  a  range.  However,  they  have 
different  shapes  near  zero  lag  (/?=0)  that,  as  will  be 
discussed  in  Chapter  4,  result  in  significant  differ¬ 
ences  in  the  prediction  results  using  the  three 
models.  The  linear  model  is  quite  different  from 
the  other  three,  in  that  it  does  not  reach  a  sill,  but 
increases  linearly  without.  This  fact  will  have 
important  implications  on  the  prediction  results 
using  a  linear  variogram.  Because  the  squared 
differences  between  residuals  tend  to  increase 


without  bound  as  the  lag  increases,  a  regionalized 
random  variable  with  a  linear  variogram  will  have 
ever-increasing  variability  about  its  mean  as  the 
size  of  the  sampling  region  is  increased.  In  appli¬ 
cations  involving  the  linear  variogram,  the  vario¬ 
gram  is  usually  truncated  at  a  sill  corresponding  to 
the  value  of  the  variogram  at  maximum  lag 

g.  Before  closing  this  section,  it  will  be  use¬ 
ful  to  highlight  some  similarities  and  contrasts 
between  the  covariance  function  and  the  vario¬ 
gram.  Although  the  variogram  is  commonly  used 
in  a  geostatistical  analysis,  it  is  sometimes  easier  to 
gain  an  intuitive  understanding  of  the  methodology 
using  the  covariance  function,  or  equivalently,  the 
spatial  variance  and  the  correlation  function. 

^en  ZQc)  has  a  stationary,  isotropic  covariance 
function  (Equation  2-9),  there  is  a  one-to-one 
correspondence  between  the  variogram  and  the 
covariance  fimction,  namely 

Y  (A)  =  C  (0)  -  C  (h)  (2-27) 

As  long  as  C(h)  approaches  zero  as  h  increases  (a 
minor  technicality  that  can  always  be  assumed  in 
practice),  then,  as  indicated  by  Equation  2-27,  the 
variogram  reaches  a  sill  and  the  sill  equals  C(0). 
Therefore,  when  dealing  with  a  covariance¬ 
stationary  regionalized  random  variable,  the  vario¬ 
gram  and  the  spatial  covariance  function  contain 
the  same  information  as  one  another.  By  factoring 
out  C(0)=5  from  Equation  2-27  and  using  Equa¬ 
tion  2-14,  the  relationship  between  the  spatial 
correlation  function  and  the  variogram  can  be 
obtained 

p  (A)  =  1  -  XW  (2-28) 

s 

From  Equation  2-28,  it  is  evident  that  high  values 
of  y(h)  (i.e.,  close  to  s)  signify  low  values  of  p(h). 
In  fact,  p(/j)  =  0  whenever  y(h)  =  s,  indicating  that 
observations  whose  locations  are  farther  apart  than 
the  range  are  uncorrelated.  As  h  gets  small,  a 
nugget  in  y(h)  is  reflected  in  a  correlation  that  is 
less  than  1 
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Figure  2-3,  Theoretical  variograms  showing  A,  exponential;  B,  spherical;  C,  Gaussian;  and  D,  linear 
models 


p(A)-*l-.^as/!^0  (2-29) 

s 

Therefore,  the  larger  g  is  in  relation  to  s,  the  less 
correlated  nearby  observations  are.  The  case  when 
called  a  pure  nugget  variogram,  results  in 
p(A)=0  for  all  h>0.  In  tiiat  case,  neighboring 
observations  are  uncorrelated  no  matter  how 
closely  they  are  spaced. 

h.  Occasionally,  y{h)  may  not  reach  a  finite 
sill,  as  in  the  linear  variogram  Equation  2-26.  In 
that  case,  it  is  not  possible  to  define  a  correlation 
function  as  in  Equation  2-28.  The  corresponding 
regionalized  random  variable  is  said  to  be  intrinsi¬ 
cally  stationary  (Joumel  and  Huijbregts  1978), 
which  is  more  general  than  covariance  stationarity. 
The  theory  behind  intrinsically  stationary  vario¬ 
grams  will  not  be  presented  in  this  ETL.  As  long 
as  a  “pseudo-range”  is  defined,  all  of  the 
computations  described  below  can  be  generalized. 


2-4.  Kriging 

a.  General 

(1)  Given  a  regionalized  random  variable  Z(x) 
with  a  known  theoretical  variogram,  the  question 
is:  how  can  the  value  of  Z(^  be  predicted  at  an 
arbitraiy  location,  based  on  measurements  taken  at 
other  locations?  Suppose  that  Z  is  measured  at  n 
specified  locations:  Z(ii), Z(i„).  For  example, 
Z  could  correspond  to  hydraulic  conductivity  and 
the  locations  might  correspond  to  n  preexisting 
wells  in  an  aquifer.  Let  a  new  location  be  given  by 
^=(wo,Vo)  and  denote  the  ith  measurement  location 
by  xr (w/,  v,).  Suppose  that,  based  on  prior  knowl¬ 
edge  of  the  geology,  there  are  no  prevailing  trends 
in  hydraulic  conductivity,  so  the  mean  of  Z(^  is 
assumed  to  be  constant  over  the  entire  region: 

p  (21)  =  ^  (constant)  (2-30) 
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(2)  Suppose  the  investigator  wants  to  predict 
fee  value  of  Zfx^)  by  using  a  linear  predictor, 

Z  (xq),  which  is  defined  as  a  weighted  linear  combi¬ 
nation  of  fee  measured  data 


2  (V  =  E  (i,) 


(2-31) 


i  = : 


where  w,  is  fee  weight  assigned  to  Z(x^.  To  deter¬ 
mine  specific  values  for ^e  weights,  some  criteria 
need  to  be  specified  for  Z  (j&))  to  be  a  good  pre¬ 
dictor  of  Z(xq).  The  first  criterion  is  feat  Z  (io)  be 
an  unbiased  predictor  of  Z(^),  which  is  expres¬ 
sed  as 


E  [Z  -  Z  (i^)]  =  0 


(2-32) 


(3)  An  unbiased  predictor  will  neither  consis¬ 
tently  overpredict  nor  imderpredict  Z(xq)  because 
fee  statistical  expectation  of  the  prediction  errors  is 
zero.  The  second  criterion  for  a  good  predictor  is 
feat  it  have  small  prediction  variance,  defined  by 


-  E  [(z  (V  -  Z  (Vf] 

(4) ^The  smaller  the  prediction  variance,  fee 
closer  Z(Xo)  will  be  (on  average)  to  fee  true  value 
Z(xo).  The  geostatistical  method  of  kriging  deals 
wife  computing  fee  best  linear  unbiased  pre¬ 
dictor  of  Zfso),  which  is  fee  linear  unbiased  pre¬ 
dictor  (Equations  2-31  and  2-32)  wife  the  smallest 
possible  prediction  variance  (Equation  2-33). 

(5)  The  form  of  fee  best  linear  unbiased  pre¬ 
dictor  will  depend  on  fee  mean  of  Z(y.  For  exam¬ 
ple,  if  Z4)  has  a  constant  mean  (Equation  2-30) 
and  a  pure  nugget  variogram  [y(A)=s  for  all  /r>0], 
fee  best  linear  unbiased  predictor  of  ZQc^)  will 
simply  be  fee  average  of  fee  measured  data 


2  (V  =  -  E  2 


n  i  =  I 


(2-34) 


Because  fee  variogram  is  fee  same  for  all  h>0  and 
there  is  no  trend  in  fee  data,  there  is  no  reason  to 
favor  any  of  fee  measurements  over  any  of  fee 
other  measurements.  Therefore,  fee  weights  are  all 
fee  same.  Ordinary  kriging,  which  is  discussed  in 
section  2-4b,  deals  with  fee  constant-mean  model 
(assumption  in  Equation  2-30)  in  which  fee  vari¬ 
ogram  is  not  a  pure  nugget  variogram.  The 
weights  of  fee  test  linear  unbiased  predictor  will 
reflect  fee  information  in  fee  variogram  and  will 
result  in  an  improved  predictor  over  fee  sample 
mean.  In  section  2-4c,  universal  kriging,  which  is 
fee  extension  of  ordinary  kriging  to  fee  case  of  a 
nonconstant  mean,  is  discussed.  Universal  kriging 
is  a  very  powerful  tool  that  can  be  used  to  combine 
regression  models  and  spatial  prediction  into  one 
unifying  theory.  Other,  more  specialized  types  of 
kriging  feat  will  be  discussed  in  this  section  are 
indicator  kriging  (section  2-6c),  block  kriging  (sec¬ 
tion  2-4d),  and  co-kriging  (section  2-5). 

(6)  Before  giving  fee  kriging  equations,  one 
final  note  is  in  order.  There  is  a  prediction  tech¬ 
nique  in  geostatistics  known  as  simple  kriging, 
which  deals  wife  best  linear  unbiased  prediction  in 
fee  case  when  fee  mean  of  Z(^  is  fixed  and  known. 
Simple  kriging  is  not  discussed  in  this  ETL, 
because,  in  most  applications,  the  mean  is  not 
known  and  has  to  be  estimated. 

b.  Ordinary  kriging. 

(1)  General. 

(a)  Let  Z(ybe  a  regionalized  random  variable 
wife  constant  mean  (Equation  2-30)  and  isotropic 
variogram  (Equation  2-20).  Also,  assume  feat  fee 
variogram  reaches  a  sill  so  feat  fee  variance  of 
Z(y  is  C(0)=5,  and  fee  correlation  function  is 
given  by  Equation  2-28.  Although  fee  prediction 
equations  can  be  expressed  in  terms  of  fee  vario¬ 
gram,  they  will  be  defined  here  in  terms  of  fee  sill 
(variance)  and  fee  correlation  function. 

(b)  Consider  linear  unbiased  predictors  of  fee 
form  of  Equation  2-31  wife  fee  condition  in  Equa¬ 
tion  2-32  holding.  The  unbiased  condition  is 
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equivalent  to  5^  w.  =  n  for  any  jr,  which  holds 

i  =  1 
n 

if  and  only  if  w.  =  1 .  Therefore,  all  linear 

i  =  1 

unbiased  estimators  need  to  have  weights  that  sum 
to  one.  There  are  many  sets  of  weights  diat  satisfy 
this  condition,  including  the  set  with  all  the  weights 
equal  to  l/n,  as  in  the  sample  mean.  Equation 
2-34.  However,  the  unique  set  of  weights  that 
minimize  the  prediction  variance  (Equation  2-33) 
can  be  shown  to  satisfy  the  following  set  of  n+1 
ordinary  kriging  equations  (Chapter  12,  Isaaks 
and  Srivastava  (1989)): 

i  =  PiO’  i=h2...,n,  (2-35a) 

y=i  s 

^  w.=  l  (2-35b) 


on  «=2  measmements  Z(xi)  and  Z(x2),  where  the 
three  locations  (xq,  and  ^2)  are  distinct.  Using 
Equations  2-23  and  2-28,  note  that  the  correlation 
function  is 

( 1  -  — ]  expf  -3—],  h>0 
p(h)  =  ^  j  I  ■  (2-37) 

1,  h=0 

For  illustrative  purposes,  suppose  that 

—  —  p,  0  <  p  ^  1  (2-38) 

s 

where  pis  a  fixed  proportion.  The  quantity  p  is 
sometimes  referred  to  as  a  relative  nugget. 

(b)  The  ordinary  kriging  Equations  2-35a  and 
2-35b  are  given  by 


where  p,y  =  p(A^)  is  the  correlation  between  obser¬ 
vations  i  and y,  hy  is  the  distance  between  locations 
i  and  j,  and  A,  is  a  coefficient  resulting  from  the 
constrained  optimization.  Furthermore,  the 
resulting  ordinary  krig^g  variance  is 

al  (i)o  =  E  [fe  (i^)  -  Z  ixj] 


X 

Wi  +  >^2  P12  +  -  =  Pio 

s 

X 

^iPn  +  ^^2  +  -  =  P20 
s 


(2-39a) 

(2-39b) 


Wj  +  W2  =  1  (2-39c) 

These  three  equations  have  three  luiknowns:  Wj, 
W2,  and  A;  the  solution  is 


can  easily  be  solved  for  the  w/s  and  X,  after  which 
the  kriging  variance  can  be  obtained  from  Equa¬ 
tion  2-36.  Note  that  the  ordinary  kriging  variance 
changes  depending  on  the  prediction  location  Xq, 
even  though  the  variance  of  Z(^)  itself  (Equa¬ 
tion  2-6)  is  constant  for  all 

1  1  Pio  P20 

w,  =  —  + - 

2  2  1  -  p,2 

1  1  Pio  “  P20 

Wo  =  — - 

2  2  1  -  p,2 

(2)  Example  1. 

(a)  Let  the  mean  of  Z(^  satisfy  Equation  2-30, 

and 

and  suppose  that  the  residual  Z*(j0  (Equa¬ 
tion  2-16)  has  an  isotropic  exponential  variogram 

^  ~  ~2  ^ 

(Equation  2-23).  Consider  predicting  ZQ^)  based 

(2-40a) 

(2-40b) 


(2-41) 


2-11 


ETL  1110-1-175 
30  Jun  97 

The  resulting  kriging  variance  is 


(4 


(2-42) 

T  "  ’^iPlO  ”  ^2^20  ~  T  (PlO  P20  "  Pn)] 


Although  there  are  only  fliree  sample  locations  in 
this  example  (two  actual  and  one  potential),  it  indi¬ 
cates  several  properties  of  best  linear  unbiased  pre¬ 
diction  that  hold  in  general.  For  example, 


(c)  Effect  of  sill.  The  kriging  weights 
depend  on  s  only  through  the  relative  nugget  p. 
However,  the  kriging  variance  is  directly  propor¬ 
tional  to  s.  The  sill  is  called  a  scaling  parameter 
because  scaling  each  measurement  by  a  constant  c 
has  the  effect  of  scaling  s  by  (?.  When  the  relative 
nugget  is  allowed  to  vary  so  that  j  and  g  can 
change  independently,  the  effect  of  s  is  somewhat 
more  complicated. 


(d)  Effect  of  nugget.  Increasing  has  the 
effect  of  drawing  each  of  the  weights  closer  to  1/2. 
In  fact,  as  p  approaches  1,  both  weights  will  equal 
1/2.  The  larger  g  is  in  relation  to  s,  die  more 
small-scale  variability  there  is  in  the  data  and  the 
less  important  the  correlation  between  neighboring 
locations  becomes.  The  increased  small-scale 
variability  also  causes  an  increase  in  the  kriging 
variance. 


(e)  Effect  of  correlations.  If  Z(j^)  is  more 
highly  correlated  with  Z(xj)  than  with  2(^2),  then 
w,  will  be  larger  than  W2,  indicating  that  the  mea¬ 
surement  at  the  first  location  has  more  predictive 
information  than  the  measurement  at  the  second 
location.  Also,  correlation  in  the  data  always 
decreases  the  kriging  variance  compared  to  the 
variance  with  imcorrelated  data. 


(f)  Effect  of  data  clamping.  IfZ(x,)and 
ZQC2)  are  highly  correlated,  as  indicated  by  p|2 
being  close  to  1,  then  the  two  measurements  con¬ 
tain  much  of  the  same  information.  Two  situations 
can  occur;  pi^  =  P20,  in  which  case  tire  weights  are 
both  equal,  or  p,o  >  P20  [Pio  <  P20])  which  case 
w,  will  be  much  larger  [smaller]  than  W2.  In  either 


case,  the  kriging  variance  will  increase  to  reflect 
the  redundant  information  in  the  two  measure¬ 
ments.  Automatic  adjustment  of  the  kriging 
weights  and  kriging  variance  to  account  for  data 
clumping  is  an  important  property  of  the  kriging 
predictor. 

(3)  Example  2  (Nugget  effect  versus  measure¬ 
ment  error). 

(a)  In  example  1,  all  three  locations  Xq,  x„  and 
£2,  were  assmned  to  be  distinct.  When  a  prediction 
location  happens  to  coincide  with  a  measurement 
location,  there  is  an  important  distinction  that 
needs  to  be  made  between  a  true  nugget  effect  and 
a  measurement  error.  Suppose  that  in  example  1, 

Xg  andx,  are  the  same.  Iftoere  is  only  small-scale 
variability,  but  no  measurement  error,  then 
repeated  measurements  at  the  same  location  should 
be  identical,  that  is,  Pio  =  1-  hi  this  case,  the  krig¬ 
ing  equations  result  in  w,  =  1 ,  W2  =  0,  and  A  =  0 
and  in  a  kriging  variance  of  zero.  That  is,  Z(x,)  is 
a  perfect  predictor  of  Z(xo).  This  property,  called 
exact  interpolation,  is  a  property  of  kriging  when 
the  data  are  assumed  to  contain  no  measurement 
errors.  However,  suppose  instead  that  the  nugget 
is  interpreted  as  measurement  error  rather  than 
small-scale  variability.  In  that  case,  repeated 
measurements  at  the  same  location  would  not  be 
perfectly  correlated,  but  rather,  p,o  =  1-g/s. 

(b)  Substituting  this  correlation  into  the  krig¬ 
ing  equations  and  solving  the  equations  results  in  a 
predictor  that  does  not  exactly  interpolate  the  data, 
but  instead  smooths  the  measured  data  to  account 
for  the  measurement  error.  In  this  ETL,  prediction 
locations  are  assumed  not  to  coincide  with  mea¬ 
surement  locations,  in  which  case  no  distinction 
needs  to  be  made  between  nugget  and  measurement 
error. 

c.  Universal  kriging. 

(1)  Universal  kriging  is  an  extension  of  ordi¬ 
nary  kriging,  tiiat,  due  to  the  fact  that  environ¬ 
mental  data  often  contain  drift,  can  be  important  in 
HTRW  site  investigations.  Universal  kriging 
addresses  the  case  of  a  nonconstant  mean  |j(x). 
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Generally,  the  mean  is  assumed  to  have  a  func¬ 
tional  dependence  on  spatial  location  of  the  form 

11  («.  v)  =  53  ^jfj  («,  v)  (2-43) 

>  =  1 


E  ^jPij  ^  -  E 

s  k^\ 


>  =  1 

~  P|0’  •••9  ^ 


(2-46a) 


where  the are  known  deterministic  func¬ 
tions  of^(u,v)  (that  is,  these  functions  serve  as 
independent  variables)  and  the  P^’s  are  regression 
coefficients  to  be  estimated  from  the  data.  For 
example,  suppose  Z(x)  is  hydraulic  head  in  an 
aquifer.  If  the  flow  is  in  a  steady  state,  it  might  be 
reasonable  to  assume,  in  a  given  case,  that  the 
mean  of  Z(x)  has  a  unidirectional  groundwater 
gradient  that  is  given  by 

p  (m,  v)  =  P,  +  P2  m  (2-44) 

Li  this  example,  there  are  two  independent 
variables: 


E  ^jfk  (-0 

y  =  1 


0C[))>  ^  >  1»  2,  ...,  p 


(2-46b) 


where,  in  contrast  to  the  ordinary  kriging  equa¬ 
tions  (2-35a  and  b),  there  are  now  p  coefficients 
2,1, ...,  Ap  resulting  from  the  unbiased  condition  on 
the  predictor.  The  first  term  in  the  mean  (Equa¬ 
tion  2-43)  will  usually  be  a  constant,  or  intercept, 
for  which/,  (i)  =  1 .  Therefore,  the  universal  krig¬ 
ing  model  includes  ordinary  kriging  as  a  special 
case.  The  universal  kriging  variance  is  given  by 


/,  (m.  v)  =  1 

/  (m,  v)  =  m 


(2-45) 


and  two  regression  coefficients  (P,  and  P2).  The 
mean  can  include  other  independent  variables 
besides  simple  algebraic  functions  of  u  and  v.  For 
example,  if  the  aquifer  is  not  of  uniform  thickness, 
an  independent  variable  diat  involves  the  aquifer 
thickness  at  location  (u,v)  could  be  included. 

(2)  The  form  assumed  for  the  mean  in  Equa¬ 
tion  2-43  is  also  generally  used  in  standard  linear 
regression  analysis.  In  regression,  ordinary  least- 
squares  is  generally  used  to  solve  for  the  coeffi¬ 
cients;  when  this  is  done,  it  is  assumed  that  the 
residuals  are  independent  and  identically  distribu¬ 
ted.  Universal  kriging  is  an  extension  of  ordinary 
least-squares  regression  friat  allows  for  spatially 
correlated  residuals.  Assuming  that  ZQc)  is  a 
regionalized  random  variable  with  a  mean  as  in 
Equation  2-43  and  residual  correlation  function  as 
in  Equation  2-28,  the  best  linear  unbiased  predictor 
(Equation  2-10)  can  be  obtained  from  the  follow¬ 
ing  n+p  equations,  called  the  universal  kriging 
equations  (Joiunel  and  Huijbregts  1978): 


5  1 


-  E 


-  E  Kfk(\) 

k=  1 


(2-47) 


These  equations  can  be  easily  solved  to  obtain 
universal  kriging  predictors  and  kriging  variances 
for  any  desired  location.  The  estimated  trend 
surface  does  not  actually  need  to  be  computed  to 
obtain  the  universal  kriging  predictor.  If  a  particu¬ 
lar  application  needs  an  estimate  of  the  trend  sur¬ 
face,  then  generalized  least-squares  regression  can 
be  used  to  estimate  the  coefficients  (p^’s)  in  the 
regression  equation. 

d.  Block  kriging, 

(1)  Up  to  this  point,  the  problem  of  predicting 
the  value  of  a  regionalized  random  variable  at  a 
given  location  in  the  region  over  which  the  variable 
is  defined  has  been  considered.  Implicit  in  this 
analysis  is  the  assumption  that  the  support  of  the 
variable  being  predicted  is  defined  in  exactly  the 
same  way  as  the  variables  that  make  up  the  mea¬ 
surements.  However,  there  may  be  applications 
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where  it  is  necessary  to  estimate  flie  average  value 
of  Z  over  an  estimation  block  of  much  larger  area 
than  is  represented  by  an  individual  sample.  For 
example,  an  estimate  of  the  average  concentration 
of  a  contaminant  over  an  entire  aquifer  based  on 
point  measurements  at  various  locations  might  be 
needed.  In  other  applications,  an  estimate  of  the 
average  concentration  of  soil  contaminant  in  daily 
excavation  volumes  that  are  much  larger  than  the 
volume  of  an  individual  sample  may  be  needed. 
Let  Zg  be  the  average  value  of  Z(x)  over  a  particu¬ 
lar  block  5, 

Z.  -  -  E  Z  K,) 

/w  ,•  =  1 


kriging  variance  is  not  as  simple,  because  the 
individual  kriging  estimates  are  not  independent  of 
one  another.  There  are  simple  modifications  to  the 
kriging  equations  discussed  in  sections  2-4b  and 
2-4c  that  can  be  used  to  directly  compute  the  krig¬ 
ing  estimate  of  Z^,  along  with  its  kriging  variance 
(Chapter  13,  Isaaks  and  Srivastava  (1989)).  The 
equations  are  not  presented  in  this  ETL.  The  com¬ 
puter  packages  described  in  the  next  section  can  be 
used  to  compute  block  kriging  estimates.  In  gen¬ 
eral,  kriged  values  of  block  averages  are  less 
variable  than  kriged  values  at  single  locations. 
Consequently,  the  blocked  kriging  variance  tends 
to  be  smaller  than  the  kriging  variance  at  a  single 
location. 


where  2o„  i=l,. denotes  m  prediction  locations 
in  block  B.  The  object  is  to  predict  this  average 
rather  than  the  regionalized  variable  at  a  single 
location.  In  many  applications,  the  locations  % 
might  correspond  to  nodes  of  a  regular  grid  or 
finite-  element  nodes  in  a  groundwater  model. 
Results  of  the  block  kriging  are  dependent  on  m 
and  on  the  placement  of  the  prediction  locations. 
Selecting  a  large  number  of  locations  in  block  B, 
where  each  location  has  approximately  the  same 
representative  area,  is  the  best  approach  (Chap¬ 
ter  13,  Isaaks  and  Srivastava  (1989). 

(2)  The  objective  ofblock  kriging  is  to  obtain 
the  best  linear  unbiased  predictor  of  Z^  and  an 
estimate  of  the  block  kriging  variance  based  on  the 
measurements.  The  model  for  Z(x)  can  be  the 
constant-mean  model  (Equation  2-30)  assumed  for 
ordinary  kriging  or  the  more  general  linear  regres¬ 
sion  model  (Equation  2-43)  assiuned  for  universal 
kriging.  In  either  case,  the  predicted  value  of  Zg 
coincides  with  the  average  of  the  predicted  values 
of  the  individual  measurements  in  the  block;  that  is 


(2-49) 


In  this  equation,  the  individual  predicted  values  are 
obtained  from  either  the  ordinary  or  universal  krig¬ 
ing  equations.  However,  computation  of  the  block 


2-5.  Co-kriging 

a.  Kriging  as  discussed  so  far  provides  a  way 
of  predicting  values  of  a  regionalized  variable  ZQc) 
at  a  location  ^  based  on  measurements  of  the  same 
variable  at  locations  In  some  situa¬ 

tions,  however,  there  will  be  available  measure¬ 
ments  not  only  of  Z(y ,  but  also  of  one  or  more 
other  variables  that  can  be  used  to  improve  predic¬ 
tions  of  Z(ro).  The  variable  Z(^  will  be  called  the 
primary  variable,  because  it  is  the  one  to  be  pre¬ 
dicted,  and  the  other  variables  will  be  called 
secondary  variables.  Co-kriging  is  the  technique 
that  allows  the  use  of  the  information  contained  in 
secondary  variables  in  the  prediction  of  a  primary 
variable.  As  an  example,  suppose  that  Z(3^  is  a 
regionalized  variable  representing  the  hexavalent 
chromium  concentration,  a  relatively  difficult 
determination,  and  that  hexavalent  chromium  con¬ 
centration  needs  to  be  predicted  at  a  location  ^ 
based  on  measurements  of  hexavalent  chromium  at 
other  locations,  but  there  are  also  measurements  of 
a  second  relatively  easily  determined  contaminant, 
for  example  lead,  that  tend  to  be  correlated  with 
hexavalent  chromium  concentration  and  these  data 
are  to  be  used  as  well.  Denote  the  second  variable 
lead  by  a  regionalized  variable  and  assume 
that  measurements  have  been  made  on  IF  at  m 
locations  x'j  ^2^  ^  m-  The  co-kriging  predictor 

of  Z(^Q  is  then 
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4  =  E  z  (X) 

'  ‘  (2-50) 

+  E  w.  w  (2-; 

y  =  1 

This  is  a  straightforward  extension  of  the  kriging 
predictor  in  Equation  2-3 1 .  Analogous  to  kriging, 
co-kriging  produces  the  weights  w,  and  Wj  so  that 
the  resulting  predictor  is  the  best  linear  unbiased 
predictor.  Also,  as  with  kriging,  co-kriging 
requires  modeling  of  the  vaiiogram  for  Z,  but 
co-kriging  presents  the  investigator  with  the  addi¬ 
tional  necessity  of  modeling  the  variogram  of  W 
and  the  cross  variogram  for  Z  and  W.  The  opti¬ 
mal  weights  are  then  expressed  in  terms  of  all 
these  variogram  properties.  More  than  one  sec¬ 
ondary  variable  may  be  included  in  the  co-kriging 
predictor,  and  theory  has  been  developed  for 
co-kriging  in  the  presence  of  drift  (universal 
co-kriging)  and  block  co-kriging.  Details  are  not 
included  in  this  ETL,  but  the  interested  reader  may 
refer  to  Isaaks  and  Srivastava  (1989)  and  Deutsch 
and  Joumel  (1992)  for  more  discussion  and  cita¬ 
tion  of  other  references. 

b.  One  situation  in  which  co-kriging  might  be 
useful  is  when  the  primary  variable  is  undersam¬ 
pled,  so  any  additional  information,  such  as  that 
given  by  secondary  variables,  would  be  helpful. 
However,  although  co-kriging  can  be  a  useful  tool, 
joint  modeling  of  several  variables  tends  to  be 
demanding  in  terms  of  data  and  computational 
requirements.  Thus,  undersampling  of  the  primary 
variable  may  present  problems  for  co-kriging  as 
well  as  for  one-variable  kriging.  Also,  unless  the 
primary  variable  of  interest  is  highly  correlated 
with  the  secondary  variable(s),  the  weights 
assigned  to  the  secondary  variable(s)  are  often 
small,  with  the  result  that  the  effort  needed  to 
include  the  additional  variable(s)  may  not  be 
worthwhile.  For  these  reasons,  co-kriging  tends 
not  to  be  used  extensively  in  practice. 

c.  Although  co-kriging  is  similar  to  universal 
kriging,  in  that  both  techniques  use  extra  variables 
to  help  predict  Z(x),  there  is  an  important 
distinction  between  the  two  techniques.  In 


universal  kriging,  the  independent  variables  in 
Equation  2-43  need  to  be  known  with  certainty  at 
the  prediction  location  For  example,  aquifer 
thickness  might  be  an  independent  variable  in 
predicting  aquifer  head  if  it  can  easily  be 
determined  at  any  location.  However,  aquifer 
thickness  may  need  to  be  considered  a  secondary 
variable  in  a  co-kriging  procedure  if  it  is  only 
known  at  a  few  selected  points  in  the  aquifer. 

2-6.  Using  Kriging  to  Assess  Risk 

a.  General 

(1)  The  kriging  predictor  of  Z(xo)  has  certain 
desirable  properties  with  respect  to  how  close  it  is 
to  the  actual  value  of  Z(^),  it  is  unbiased  and  has 
smallest  variance  among  all  linear  predictors.  On 
the  average,  or  in  an  expected  sense,  the  predicted 
value  will  be  near  the  actual  value.  When  possi¬ 
ble,  however,  the  investigator  would  like  to  go  fur¬ 
ther  in  specifying  the  relationship  between  the 
predicted  and  observed  values.  Ideally,  the  investi¬ 
gator  would  like  to  make  probability  statements. 
For  example,  if  Z(^)  is  concentration  of  a  contam¬ 
inant,  the  investigator  might  like  to  be  95  percent 
certain  that  the  true  concentration  is  within 

0.05  ug/d  of  the  predicted  concentration.  In  other 
situations,  the  probability  that  the  actual  concen¬ 
tration  exceeds  a  given  target  value  might  need  to 
be  estimated.  Knowledge  of  the  entire  distribution 
function  of  Z(x),  as  opposed  to  knowledge  of  only 
its  mean  and  variogram,  can  be  used  for  risk- 
qualified  inferences  in  situations  when  extremes 
might  be  of  more  interest  than  averages. 

(2)  Introduction  of  the  concept  of  a  condi¬ 
tional  probability  distribution  function  of  the 
regionalized  variable  Z(x)  is  appropriate  at  this 
point.  This  concept  will  also  be  used  in  Chapter  7 
when  conditional  simulation  is  discussed.  The 
conditional  probability  distribution  function  has  a 
definition  much  like  that  of  the  probability  distri¬ 
bution  function  in  section  2-2,  except  the  proba¬ 
bility  that  Z(^  ^  c  is  computed  “conditional  on,” 
or  “given,”  information  at  other  spatial  locations. 
The  interest  in  geostatistics  is  to  make  predictions 
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at  a  location  using  information  at  measurement 
locations  j,,  X2, x„,  so,  in  terms  of  conditional 
distributions,  interest  focuses  onP[Z{x^  ^  c\Z 
(ii),  Z  (xj),  Z{x„)  ].  The  vertical  bar  denotes  the 

conditioning  and  is  read  “given.”  This  conditional 
probability  distribution  needs  to  be  determined  to 
make  probability  statements  about  the  regionalized 
variable  at  location  Xg.  Also,  conditional  mean 
and  conditional  variance  can  be  deiined  in  the 
present  context  in  the  same  way  that  mean  and 
variance  for  distribution  functions  were  defined  in 
section  2-2. 

(3)  Section  2-6b  contains  metiiods  for  using 
kriging  output  to  obtain  prediction  intervals  or 
quantiles  when  the  regionalized  random  variable  is 
either  normally  distributed  or  can  be  transformed 
to  a  near-normal  distribution.  Section  2-6c  dis¬ 
cusses  indicator  kriging,  which  is  a  nonparametric 
method  for  obtaining  quantiles  when  data  cannot 
be  transformed  adequately  to  a  normal  distribution. 

b.  Normal  distributions  and  transformations. 

(1)  For  prediction  at  a  location  Xg,  a  kriging 
analysis  produces  the  predictor  Z  (xg)  and  the  asso¬ 
ciated  kriging  variance  a\  (i^) .  If  more  informa¬ 
tive  probability  assessments  are  to  be  made,  the 
ideal  situation  is  when  Z(x)  can  be  assumed  to  be  a 
Gaussian,  or  normal,  process,  which  means  that 
[Z(xi),...,  Z(x„)]  has  a  joint  normal  probability  dis¬ 
tribution  for  any  set  of  n  locations  and  any  value 
of  n.  In  this  case,  the  conditional  probability  dis¬ 
tribution  of  Z(xg)  given  the  n  observations  is  a  nor¬ 
mal  distribution  v^th  conditional  mean  equal  to  the 
kriging  predictor  Z  (xq)  and  conditional  variance 
equal  to  the  kriging  variance  (a^) .  This  normal 

distribution  can  be  used  to  obtain  a  prediction 
interval  for  Z(xg)  (conditional  on  the  measured 
data).  For  example,  from  a  table  of  the  normal 
distribution,  a  value  of  1.96  corresponding  to  a 
0.95  (two-sided)  probability  can  be  obtained. 

Then  the  assertion  that  there  is  a  95-percent  chance 
that  Z(xg)  will  be  in  the  95-gercent  prediction  inter¬ 
val  [Z(^)  - 1.96  o^  Ca^),  Z(£^)  +  1.96  o^OSg)] 
can  be  made.  Knowing  this  interval  is  much  more 
useful  than  simply  knowing  the  kriging  predictor 
and  variance. 


(2)  To  illustrate  quantile  estimation,  suppose 
that  contaminant  concentrations  are  being  studied 
and  the  concentration  that  has  only  a  1 -percent 
chance  of  being  exceeded  at  location  Xg  needs  to  be 
determined.  The  appropriate  (one-sided)  value 
fropi  a  normal  table  is  2.33,  so  the  desired  estimate 
is  Z  (2^)  +  2.33a^  (^). 

(3)  Even  if  Z(x)  is  not  Gaussian,  it  is  often 
possible  to  find  a  transformation,  Y(^=T(Z(^), 
such  that  y(x)  is  approximately  Gaussian.  When  a 
transformation  is  made,  the  kriging  analysis  is  per¬ 
formed  using  file  transformed  data  1%),  and  flie 
inverse  transformation  may  be  applied  to  obtain 
prediction  intervals  for  the  original  data.  For 
example,  the  most  common  transformation  is  the 
(natural)  logarithmic  transformation,  in  which 
TQr)=ln[ZQr)].  A  95-^ercent  prediction  intervaj^ 
for  Z(^  is  then  {exp  [T(xg)  - 1.96  a4(xg)],  exp  [7 
(xq)  +  1.96  o/xg)]}.  As  long  as  the  transformation 
is  a  one-to-one  function  such  as  a  logarithmic 
transform,  prediction  intervals  for  the  original  data 
can  be  obtained  by  simply  back-transforming  pre¬ 
diction  intervals  for  the  transformed  data. 

(4)  Although  it  is  a  simple  matter  to  obtain 
prediction  intervals  and  probabilities  using  simple 
back-transformation,  it  is  more  difficult  to  obtain  a 
predictor  of  the  untransformed  data  that  is  both 
unbiased  and  optimal  in  some  sense.  For  example, 
in  the  case  of  a  logarithmic  transformation,  a 
kriging  analysis  using  the  transformed  data  yields 
a  predictor  7  (xg),  which  is  the  best  linear  unbi¬ 
ased  predictor  of  7^(xg).  However,  the  back- 
transformed  value  Z  (xg)  =  exp  [Y  (xg)]  does  not 
possess  these  same  optimality  properties  as  a  pre¬ 
dictor  of  Yxo.  The  methodology  known  as  log¬ 
normal  kriging,  and  more  generally  trans-  normal 
kriging,  has  been  developed  to  obtain  predictors  in 
this  setting  (Joumel  and  Huijbregts  1978),  but 
because  of  the  complexity  involved  in  these  pro¬ 
cedures,  they  are  not  usually  used  by  practitioners. 
If  a  predicted  value  corresponding  to  Z(xg)  needs  to 
be  obtained  for  purposes  of  contour  plotting,  the 
kriging  predictions  7  (xg)  may  be  back-transformed 
and  plotted,  as  long  as  the  investigator  realizes  that 
such  values  do  not  have  the  usual  kriging  opti¬ 
mality  properties. 
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c.  Indicator  kriging. 


(1)  There  may  be  situations  when  a  transfor¬ 
mation  that  makes  Z(^  approximately  normal 
cannot  be  easily  determined.  In  such  situations, 
indicator  kriging  can  be  used  to  make  inferences 
about  the  probability  distribution  of  Z(^.  Because 
no  distributional  assumptions  are  made,  this  tech¬ 
nique  is  known  as  a  nonparametric  statistical 
procedure.  An  example  of  indicator  kriging  is 
included  in  Chapter  5,  and  a  paper  by  Joumel 
(1988)  is  a  good  reference  for  additional  informa¬ 
tion  about  indicator  kriging. 

(2)  To  perform  indicator  kriging,  a  special 
transformation,  known  as  an  indicator  transforma¬ 
tion,  is  applied  to  Z(^: 


I  (i,c)  = 


1,  Z(i)  ^  c 
0,  Z(i)  >  c 


(2-51) 


If,  as  in  the  usual  kriging  scenario,  the  data  set  at 
hand  consists  of  measurements  of  the  regionalized 
variable  Z(x)  at  n  locations,  c  needs  to  be  fixed 
first,  and  then  the  indicator  transformation  is 
applied  by  replacing  values  that  are  less  than  or 
equal  to  c  with  1  and  values  that  are  greater  than  c 
with  0.  The  variogram  and  kriging  analysis  is  then 
performed  using  these  O’s  and  1  ’s  rather  than  the 
raw  data. 


(3)  Kriging  predictors  using  the  indicator  data 
will  be  equal  to  &eir  observed  values  of  0  or  1  at 
the  measurement  locations  x,,  i=l,...,/i.  However, 
at  locations  different  &om  the  measurement  loca¬ 
tions,  predictions  may  be  between  0  and  1.  In 
interpreting  these  values,  the  power  of  indicator 
kriging  becomes  rpparent.  A  predicted  value  at  xo 


is  an  estimate  of  the  conditional  probability  distri¬ 
bution  P  [Z  (i^)  i  c|Z  (Sj)  Z  (s^),  ...,  Z  (s^)]. 
This  analysis  may  be  performed  for  a  range  of 
values  of  c,  and  by  doing  this  the  entire  distribution 
function  can  be  estimated.  This  estimate  of  the 
distribution  fimction  can  be  used  in  the  same  man¬ 
ner  discussed  above  to  obtain  prediction  intervals 
or  estimates  of  quantiles.  For  example,  to  estimate 
the  value  that  has  a  1 -percent  chance  of  being 
exceeded  at  location  Xq,  the  value  of  c  for  which  the 
kriged  indicator  prediction  is  0.99  at  that  location 
is  determined. 

(4)  One  advantage  of  indicator  kriging  is  that 
the  indicator  variogram  is  robust  with  respect  to 
extreme  outliers  in  the  data  because  no  matter  how 
large  (or  small)  Z(x)  is,  the  indicator  variable  is 
either  0  or  1 .  Indicator  variables  may  also  be  used 
in  the  context  of  block  kriging.  For  example,  a 
spatial  average  of  I(x,c)  over  a  block  B  equals  the 
fection  of  block  B  for  which  Z(x)  is  less  than  c. 
Another  advantage  of  indicator  kriging  is  that  it 
can  be  used  when  some  data  are  censored. 

(5)  Despite  the  relative  ease  of  implementa¬ 
tion,  there  are  several  drawbacks  to  indicator 
kriging,  and  investigators  may  wish  to  use  this 
technique  only  when  other  methods,  such  as 
normality  transformations,  produce  unacceptable 
results.  For  example,  the  kriged  values  of  I(x,c) 
may  be  less  than  0  or  larger  than  1.  Also,  the 
kriged  prediction  for  /(x,c,)  may  be  larger  than  the 
kriged  prediction  for  /fec2)  even  if  c,<C2,  which  is 
not  compatible  with  a  valid  probability  distribu¬ 
tion.  There  are  several  more  advanced  techniques 
for  dealing  with  fliese  problems  (Chapter  18, 

Isaaks  and  Srivastava  (1989);  however,  they  are 
beyond  the  scope  of  this  ETL. 
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Chapter  3 

Geostatistical  Resources  and  Tools 


Since  the  mid-1970’s,  a  myriad  of  texts  and  arti¬ 
cles  have  been  published  that  are  either  totally 
dedicated  to  geostatistical  methods  or  discuss 
geostatistics  in  detail.  Numerous  computer  pro¬ 
grams  and  software  packages  on  geostatistics  and 
kriging  accompany  many  of  these  texts.  Although 
only  a  few  of  tiiese  resources  will  be  briefly 
described  in  this  ETL,  their  lists  of  references  can 
provide  the  interested  reader  a  path  to  other  geo¬ 
statistical  topics  or  software  not  specifically 
covered  in  the  resources. 


3-1.  Texts  on  Geostatistics 

a.  The  geostatistical  texts  presented  in  this 
section  can  be  classified  into  two  broad  categories: 
instructional  texts  or  reference  texts.  For  one  who 
is  delving  into  geostatistics  for  the  first  time, 
Clark’s  (1979)  book  is  a  starting  point.  Simple 
explanations  of  fee  basic  kriging  techniques  are 
applied  to  an  example  data  set.  A  more  advanced 
treatment  of  fee  kriging  techniques  is  described  by 
Isaaks  and  Srivastava  (1989).  This  textbook  pre¬ 
sents  a  detailed  discussion  of  many  of  fee  back¬ 
ground  statistical  tools  and  concepts  needed  in 
geostatistical  applications,  including  histograms 
and  distributions  (univariate  and  bivariate), 
sampling,  correlation,  and  spatial  continuity.  The 
text  also  discusses  how  to  treat  fee  subtleties  of 
kriging  using  three  data  sets  as  examples.  As  well 
as  being  instmctional,  fee  book  also  can  be  used  as 
a  reference. 

b.  Texts  by  Cressie  (1991)  and  Joumel  and 
Huijbregts  (1978)  describe  fee  tools  of  geostatis¬ 
tics,  but  also  include  a  comprehensive  theoretical 
background  on  fee  techniques.  Cressie’s  (1991) 
text  is  a  treatment  of  spatial  processes  in  general 
and  reviews  a  wide  range  of  statistical  techniques 
in  fee  analysis  and  stochastic  modeling  of  spatial 
data.  There  is  a  four-chapter  section  on  geosta¬ 
tistics,  wife  a  complete  discussion  of  variogram 
estimation,  kriging  (including  universal  kriging). 


intrinsic  random  functions,  and  comparisons  of 
kriging  to  other  spatial  prediction  techniques.  The 
text  is  written  from  a  statistician’s  point  of  view 
and  is,  in  places,  written  at  a  fairly  high  level 
mathematically.  It  nevertheless  contains  numerous 
examples  and  illustrations  using  real-world  data. 
Joumel  and  Huijbregts  (1978)  maintain  a  mining- 
geological  perspective.  Two  other  texts  written  by 
statisticians  that  present  general  treatments  of  spa¬ 
tial  processes,  but  feat  lack  detailed  discussions  of 
kriging,  are  Cliff  and  Ord  (1981)  and  Ripley 
(1981). 

c.  David’s  (1977)  text  was  fee  first  extensive 
discussion  of  geostatistics  and  kriging  in  mining 
applications,  and  fee  discussion  is  presented  from  a 
practitioner’s  viewpoint.  Its  value  as  reference 
material  derives  from  fee  many  specific  mining 
applications  and  results.  A  broad  statistics  text 
wife  a  bent  toward  geological  applications  (Davis 
(1986),  serves  as  a  reference  for  standard  statisti¬ 
cal  procedures  needed  in  geological  ^plications  of 
geostatistics.  A  book  by  Bras  and  Rodriguez- 
Itrube  (1985)  that  discusses  a  range  of  techniques 
for  stochastic  modeling  in  fee  field  of  hydrology 
includes  a  chapter  on  applications  of  kriging. 

There  is  a  fairly  con[q)lete  mathematical  develop¬ 
ment  of  kriging  wife  details  of  an  application  to 
predict  mean  areal  precipitation.  In  a  paper  pre¬ 
pared  for  fee  U.S.  Environmental  Protection 
Agency,  Joumel  (1993)  discusses  geostatistics  as  it 
relates  to  environmental  science.  Finally,  Olea 
(1991)  presents  a  useful  glossary  of  geostatistical 
terms. 


3-2.  Useful  Journals 

The  journal  Mathematical  Geology  by  the  Inter¬ 
national  Association  for  Mathematical  Geologists 
reports  new  developments  in  fee  theory  and  appli¬ 
cation  of  kriging.  Although  many  of  fee  articles 
present  new  jq)plications  of  kriging  tools,  many 
also  are  dedicated  to  fee  derivation  of  statistical 
properties  of  fee  variogram,  kriging  estimation, 
and  cross-validation  results.  Journals  such  as 
Water  Resources  Research,  published  by  fee 
American  Geophysical  Union,  and  Groundwater, 
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published  by  the  Association  of  Groundwater 
Scientists  and  Engineers,  contain  articles  describ¬ 
ing  special  applications  of  kriging  techniques  in  the 
environmental  arena.  Water  Resources  Research 
tends  to  contain  articles  that  are  highly  theoretical. 
Other  journals  that  may  contain  information 
addressing  geostatistics  are  the  Journal  of  Envi¬ 
ronmental  Engineering,  published  by  the  Ameri¬ 
can  Society  of  Civil  Engineers;  Stochastic 
Hydrology  and  Hydraulics,  published  by  Springer 
International,  and  the  North  American  Council  on 
Geostatistics,  published  by  the  Colorado  School  of 
Mines. 


3-3.  Software 

a.  The  geostatistics  software  described  in  tiiis 
section  is  limited  to  a  few  readily  available  public 
domain  packages  that  are  executable  at  least  on  the 
DOS  and  sometimes  on  the  UNIX  platforms. 

There  are  several  commercial  paclmges  that  are 
being  marketed,  but  these  will  not  be  reviewed  in 
this  ETL.  It  is  beyond  the  scope  of  this  ETL  to 
acquire  and  evaluate  commercial  packages;  how¬ 
ever,  a  matrix-like  table  (Table  3-1)  has  been 
included.  The  table  addresses  each  of  the  software 
packages  described  in  this  ETL  and  also  may  serve 
as  a  reference  guide  to  other  software  packages. 

b.  Some  of  the  earliest  interactive  kriging 
software  offered  as  a  package  was  developed  by 
Grundy  and  Miesch  (1987).  Overall,  this  general 
statistics  package  (STATPAC)  contains  a  series  of 
programs  that  can  handle  two-dimensional  kriging, 
including  universal  kriging.  The  package  has 
capabilities  for  data  transformations,  variogram 
analyses,  cross-validation,  and  univariate  statistics 
(Table  3-1).  Graphics  in  the  package  are  limited 
to  simple  line-printer  plots  of  the  sample  vario¬ 
gram  points  and  data  maps.  The  menu-driven 
package  includes  a  tutorial  using  all  of  the  kriging 
routines.  The  package  is  distributed  with  not  all, 
but  most  source  codes  and,  therefore,  can  be 
modified  by  die  user  if  desired.  All  two- 
dimensional  kriging  routines  can  be  executed  from 
the  command  line,  which  provides  users  with  the 
opportunity  for  batch  processing. 


c.  The  geostatistical  environmental  assess¬ 
ment  software  know  as  GEO-EAS  (Englund  and 
Sparks  1991)  also  is  an  interactive,  menu-driven 
kriging  software  package  for  performing  two- 
dimensional  kriging.  It  has  no  direct  provisions  for 
universal  kriging  (Table  3-1).  GEO-EAS  does 
have  an  advantage  over  STATPAC  in  its  enhanced 
graphics  capabilities,  which  are  useful  in  the  inter¬ 
active  fitting  of  theoretical  variograms  to  sample 
variogram  points.  In  addition,  in  the  computation 
of  the  sample  variogram  points,  GEO-EAS  allows 
for  variable  bin  sizes,  the  use  of  which  will  be  fur¬ 
ther  discussed  in  Chapter  4. 

d.  STATPAC  and  GEO-EAS  were  originally 
developed  for  the  personal  computer.  Since  then, 
versions  of  GEO-EAS  have  been  developed  for 
some  types  of  work  stations.  The  kriging  routines 
in  STATPAC  have  not  been  adapted  to  work 
stations. 

e.  A  third  software  package,  the  geostatistical 
software  library  known  as  GSLBB  (Deutsch  and 
Joumel  1992),  is  a  suite  of  programs  developed 
over  the  years  at  Stanford  University,  Stanford, 
CA.  It  is  presented  as  a  collection  of  routines  that 
are  machine-independent  (Table  3-1)  and  are 
intended  to  be  used  as  a  modular  concept.  The 
package  is  distributed  as  a  suite  of  FORTRAN 
source  codes  that  need  to  be  compiled.  Use  of 
GSLIB  requires  a  relatively  high  level  of  familiar¬ 
ity  with  geostatistics  for  its  efficient  use.  As  in  the 
previous  two  software  packages,  GSLIB  handles 
variogram  analysis  and  kriging  techniques 
(Table  3-1).  Two  of  its  primary  advantages  over 
the  other  two  packages  are  its  simulation  tech¬ 
niques  and  ability  to  analyze  three-dimensional 
data  sets.  Such  techniques  are  useful  especially  in 
estimating  potential  extreme  outcomes  in  a  geosta¬ 
tistical  analysis. 

f.  The  Department  of  Defense  Groundwater 
Modeling  System  (GMS)  is  a  fourth  software 
package  that  has  kriging  capabilities.  GMS  is  a 
windows-based  integrated  modeling  environment 
for  site  characterization,  groundwater  flow  and 
transport  modeling,  and  visualization  of  results. 
The  GSLIB  software  has  been  implemented  within 
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GMS  to  facilitate  two-  and  three-dimensional 
kriging  and  interactive  variogram  modeling.  GMS 
also  provides  comprehensive  visualization  tech¬ 
niques  as  well  as  other  interpolation  techniques 
that  can  be  used  as  alternatives  to  kriging.  The 
GMS  system  was  developed  for  the  Department  of 
Defense  by  the  Brigham  Young  University  Engi¬ 
neering  Computer  Graphics  Laboratory.  GMS 
may  be  obtained  from  the  U.S.  Army  Groimdwater 
Modeling  Technical  Support  Center,  (U.S.  Army 
Engineer  Waterways  Experiment  Station,  Vicks¬ 
burg  MS  39180). 


g.  A  final  note  concerning  geostatistical  soft¬ 
ware  and  literature  is  that  there  can  be  differences 
in  jargon  or  notation.  These  differences  may 
cause  some  initial  confusion  if  users  or  readers  do 
not  pay  careful  attention  to  the  jargon  or  notation. 
For  example,  some  authors  may  wish  to  use  the 
term  “semi-variogram”  rather  than  “variogram”; 
others  may  express  random  variables  as  other  than 
Z  as  has  been  done  in  this  ETL,  and  it  is  common 
for  different  software  to  have  different  references 
for  directional  angles  when  discussing  anisotropy. 
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Table  3-1 

Geostatistical  Software  Characteristics 

Characteristic 

STATPAC 

GEO-EAS 

GSLIB 

GMS2.0 

Operating  system 

DOS 

DOX/UNIX 

Independent  (requires 
FORTRAN  compiler) 

WINDOWS  95  UNIX 

Menu-driven 

Yes 

Yes 

No 

Yes 

Batch  processing 

Yes 

No 

Yes 

Yes 

User  modifications 

Yes,  source  code 
provided 

No 

Yes,  source  code 
provided 

No 

Data-set  constraints 

Yes,  modifications 
possible  via 
source  code 

Yes 

Yes.  modification 
possible  via 
source  code 

Yes 

ASCII  output 

Yes 

Yes 

Yes 

Yes 

Univariate  statisitcs 

Yes 

Yes 

Yes 

Yes 

Additional  exploratory 

Yes 

Yes 

Yes 

Yes 

capabilities 

Graphical  support  for 

Yes 

Yes 

Yes 

Yes 

analysis 

Transformation 

Yes 

Yes 

Yes 

Yes 

Back-transformation 

No 

No 

Yes 

Yes 

Variogram  construction 

Yes 

Yes 

Yes 

Yes 

Variogram  analysis 

Yes 

Yes 

Yes 

Yes 

Variogram  graphics 

Yes 

Yes 

Yes 

Yes 

Cross-validation 

Yes 

Yes 

Yes 

Yes 

operations 

Ordinary  kriging 

Yes 

Yes 

Yes 

Yes 

Universal  kriging 

Yes 

No 

Yes 

Yes 

Block  kriging 

Yes 

Yes 

Yes 

Yes 

Indicator  kriging 

Yes 

Yes 

Yes 

No 

Conditional  simulation 

Perhaps  with  batch 
processing 

No 

Yes 

No 

Three-dimensional 

kriging 

Perhaps  with  batch 
processing 

No 

Yes 

Yes 

Mapping 

Yes 

Yes 

Yes 

Yes 

Contouring 

Yes 

Yes 

Yes 

Yes 

Gray-scale  maps 

Yes 

Yes 

Yes 

Yes 

Line  printer 

Yes 

No 

Yes 

Yes 

High-resolution  screen 

No 

Yes 

Yes  via  postscript 

Yes 

High-resolution  printer 

No 

Yes 

Yes 

Yes 

Postscript 

No 

No 

Yes 

Yes 
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Chapter  4 

Practical  Aspects  of  Variogram 
Construction  and  Interpretation 


4-1.  General 

a.  Chapter  2  presented  the  mathematical 
foundation  for  geostatistics  and  the  kriging  tech¬ 
nique.  One  theme  that  pervades  the  technique  is 
the  importance  of  the  theoretical  variogram.  The 
theoretical  variogram,  or  what  we  will  often  refer 
to  simply  as  the  variogram,  is  a  mathematical 
fimction  or  model  which  is  fitted  to  sample- 
variogram  points  obtained  firom  data.  Permissible 
models,  which  include  those  given  in  Chapter  2, 
belong  to  a  family  of  smooth  curves  having  par¬ 
ticular  mathematical  properties  and  are  each  speci¬ 
fied  by  a  set  of  parameters.  Chapter  4  will 
describe  a  sequence  of  stages  for  estimating  and 
investigating  sample  variogram  points  and  a  cali¬ 
bration  procedure  for  specifying  the  parameters  of 
the  variogram  model  eventually  fitted  to  the  sample 
points.  Although  the  calibration  procedure  is 
largely  an  objective  means  for  evaluating  theoreti¬ 
cal  variograms,  the  process  of  obtaining  sample 
variogram  points  and  finalizing  a  theoretical  vari¬ 
ogram  remains  an  art  as  much  as  a  science.  An 
understanding  of  the  material  presented  in  Chap¬ 
ter  2  as  well  as  professional  judgment  achieved 
through  experience  in  geostatistical  studies  is 
important  in  effectively  using  the  guidelines  pre¬ 
sented  in  this  section. 


b.  An  accurate  estimate  of  a  variogram  is 
needed  from  a  kriging  perspective  because  the  cor¬ 
relation  matrix  used  to  obtain  the  kriging  weights 
is  constructed  firom  the  variogram  values.  Even 
more  directly,  tiie  variogram  affects  the  computa¬ 
tion  of  the  kriging  variance  (Equations  2-36  and 
2-47)  through  the  product  of  the  kriging  weights 
and  variogram  values.  An  accurate  variogram  also 
has  utility  outside  the  strict  context  of  kriging.  For 
example,  in  augmenting  a  spatial  network  with  new 
data  collection  sites,  the  range  parameter  of  the 
variogram  could  be  used  as  the  minimum  distance 
of  separation  between  flie  new  sites  and  between 
new  and  existing  sites  to  maximize  overall 
additional  regional  information.  In  another  non- 
kriging-specific  application,  the  variogram  is  used 
in  dispersion  variance  computations  in  which  the 
variance  of  areal  or  block  values  is  estimated  from 
the  variance  of  point-data  values  (e.g.,  Isaaks  and 
Srivastava  (1989),  p.  480). 

c.  The  stages  of  variogram  construction  are 
described  using  an  example  data  set  of  ground- 
water  elevations  measured  near  Saratoga,  WY 
(Lenfest  1986),  that  are  summarized  in  Table  4-1 
and  whose  relative  locations  are  shown  in 
Figure  4-1. 

d.  The  sequence  of  steps  in  computing  sample 
variogram  points  depends  on  the  stationarity  prop¬ 
erties  of  the  regional  variable  represented  by  the 
data.  If  the  mean  of  the  regional  variable  is  tiie 
same  for  all  locations,  then  it  is  said  to  be  spatially 


Table  4-1 

Univariate  Statistics  for  Example  Data  Sets^ 


Example 

Identifier 

Number  of  Minimum 

Measurements  Transformation  (Base  units) 

Maximum 
(Base  units) 

Mean 

(Base  units) 

Median 
(Base  units) 

Standard 

Deviation  Skewness 
(Base  units)  (Dimensionless) 

Saratoga 

Drift 

2,016.6 

2.254.3 

2,119,25 

2,104.35 

56.79 

0.45 

Water  level  A 

83 

Drift 

25.6 

65.68 

42.30 

38.54 

10.13 

1.03 

Water  level  B 

74 

Drift 

25.6 

65.68 

42.85 

38.71 

10.59 

0.87 

Bedrock  A 

108 

None 

22.64 

80.48 

44.42 

42.82 

10.76 

0.89 

Bedrock  B 

89 

None 

24.53 

69.22 

43.67 

43.17 

8.58 

0.26 

Water  quality  A 

66 

Natural  log 

2.08 

8.01 

5.19 

5.59 

1.75 

-0.42 

^Base  unit  for  Saratoga,  water  levels,  A  and  B,  and  Bedrock  A  and  B  is  feet;  base  unit  for  water  quality  A  is  log  concentration, 
concentration  in  micrograms  per  liter. 
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Figure  4-1,  Measured  water  levels  from  Saratoga 
data 

stationary;  if  the  mean  changes  with  location,  then 
it  is  spatially  nonstationary.  Generally,  if  the  data 
have  a  stationary  spatial  mean,  the  discussions  in 
sections  4-3  and  4-7,  which  address  nonstation- 
arity  and  additional  trend  considerations,  can  be 
omitted.  If  the  spatial  mean  is  not  stationary,  as 
for  this  example  data  set,  then  sections  4-3  and  4-7 
become  important,  and  the  sequence  of  stages  for 
obtaining  a  variogram  becomes  an  iterative  pro¬ 
cedure.  All  variogram  and  kriging  computations 
for  the  Saratoga  groundwater  levels  example  were 


performed  by  the  interactive  kriging  software 
described  in  Grundy  and  Miesch  (1987). 

4-2.  General  Computation  of  Empirical 
Variogram 

a.  As  described  in  section  2-3,  the  variogram 
y{h)  characterizes  the  spatial  continuity  of  a 
regional  variable  for  pairs  of  locations  as  a  func¬ 
tion  of  distance  or  lag  h  between  the  locations. 

This  variogram  is  sometimes  called  the  theoretical 
variogram  because  it  is  assigned  a  continuous 
functional  form  that  expresses  the  spatial  correla¬ 
tion  for  any  lag  in  the  region  of  analysis.  The 
function  is  estimated  by  fitting  one  of  the  equations 
given  in  section  2-3  to  empirical  or  sample  vario¬ 
gram  points  y(A)  using  data  whose  locations  con¬ 
tribute  only  a  finite  number  of  lags.  Although  %h) 
characterizes  the  spatial  correlation  of  the  data,  it 
is  computed  from  residuals  of  the  data  off  the  spa¬ 
tial  mean.  Therefore,  without  prior  knowledge  of 
nonstationarity  in  the  underlying  spatial  process, 
the  first  step  in  computing  the  sample  variogram  is 
to  identify  existing  nonstationarity  indicated  for  the 
spatial  mean. 

Z>.  The  approximation  to  Equation  2-19 
begins  by  computing  squared  differences  from 
the  data  values  z(Xj),  z(x2),  ...z(xj  collected  at  loca¬ 
tions  X/,  ...  2^ 

Dfj  =  [z  (i)  -  z  (4-1) 

If  the  spatial  mean  is  stationary,  then  the  squared 
differences  of  the  data  are  equivalent  to  the 
squared  differences  of  the  residuals,  and  sample 
variogram  computations  can  be  continued  using 
the  data  themselves.  If  the  spatial  mean  is  strongly 
nonstationary,  the  plot  of  Equation  4-1  versus  the 
distance  between  associated  points  may  indicate  a 
trend  or  drift  that  would  need  to  be  removed  before 
further  variogram  computations  could  be  made. 
Drift  would  have  to  be  considered  in  HTRW 
studies,  such  as  deterniining  contaminant  concen¬ 
trations  areally  dispersed  from  localized  sources  or 
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determining  groundwater  elevations  following  a 
local  or  regional  gradient.  In  such  studies  sample 
variogram  computations  need  to  be  made  using 
residuals  obtained  by  subtracting  the  estimated 
drift  value  at  each  location  firom  the  value  of  the 
datum  at  the  location. 

c.  The  data  in  Equation  4-1  are  differenced 

without  considering  the  relative  direction  between 
the  locations;  that  is,  is  isotropically  com¬ 
puted.  A  plot  of  Dy  versus  Ay  for  all  i,j  (i>j), 
where  =  1  -  i.  | ,  produces  a  cloud  of 

points  whose  properties  govern  the  behavior  ofy. 
The  central  tendency  of  the  cloud  would  generally 
increase  with  A.  A  substantial  increase  in  the 
central  tendency  that  persists  for  large  h  can  indi¬ 
cate  a  nonstationary  spatial  mean.  The  cloud  com¬ 
puted  for  the  Saratoga  data,  with  groundwater 
levels  (z)  in  meters  and  distance  (A)  in  kilometers, 
is  shown  in  Figure  4-2  and  does  show  increasing 

with  increasing  A,  indicating  potential 
non-stationarity. 

d.  Generally,  there  is  a  large  amount  of  scat¬ 
ter  in  these  plots,  as  seen  in  Figure  4-2,  and  this 
scatter  can  conceal  the  central  behavior  of  with 
A.  One  way  to  estimate  the  central  tendency  and  to 
minimize  the  effect  of  aberrant  data  values  is  to 
collect  the  IP  into  K  bins  or  lag  intervals  of  width 
(AA^i  ,k=\,...K  and  assign  to  y  the  average  of  die 
values  of  Z)^  in  each  bin.  This  process  is  similar  to 
the  way  data  are  placed  in  bins  for  obtaining  histo¬ 
grams.  The  expression  for  the  kth  average  bin 
value  is 

Y  =  — - —  E  ^i4  (Kj)  (4-2) 

where  A^fA*)  is  the  number  of  squared  differences 
that  fall  into  bin  k,  and  A^  is  the  lag  distance  asso¬ 
ciated  with  bin  k.  IkfhiJ  is  an  “indicator  function” 
that  has  a  value  of  one  if  the  A, ,  falls  into  bin  k  and 
zero  otherwise  (it  only  includes  values  of  Dy  in 
the  calculation  that  have  an  Ay  that  falls  into  the 
bin).  The  lag  value  A*  can  be  the  midpoint  of  the 
bin  or  it  can  be  the  average  of  the  actual  lag  values 
for  the  points  that  fall  in  the  bin. 


e.  To  establish  bins,  either  equal  bin  widths 
are  specified  and  the  distance  between  the  two 
most  separated  data  points,  A^^i,  is  subdivided 
according  to  these  equal  increments,  or  a  AT  is 
chosen  that  defines  the  bin  width.  For  the  Sara¬ 
toga  data,  a  bin  width  of  about  8  km  established 
K=12  bins  for  y.  The  y  points  computed  from  the 
binned  £>y  values  of  Figure  4-2  are  shown  in  Fig¬ 
ure  4-3.  The  lag  plotting  positions  are  the  average 
A  values  in  the  bin.  The  symbol  x  indicates  that 
N(h)  is  less  than  30  pairs  for  the  particular  bin  and 
this  differentiation  will  be  discussed  in  section  4-3. 
Although  the  sample  variogram  is  still  preliminary, 
its  general  behavior  at  this  stage  is  adequate  to 
indicate  if  nonstationarity  needs  to  be  addressed 
before  sample  variogram  refinement  is  undertaken. 


4-3.  Nonstationarity 

a.  An  indication  of  substantial  nonstationarity 
or  drift  in  the  spatial  mean  would  be  a  parabolic 
shape  through  all  lags  in  a  plot  ofy.  This  shape 
occurs  because  differences  between  data  contain 
differences  in  the  drift  component  that  increase  as 
A  increases.  If  Equation  2-16  is  inserted  into 
Equation  2-17,  squaring  the  differences  in  p 
greatly  amplifies  the  increase  with  A.  In  these 
cases  of  drift,  generally  a  low-order  (less  than 
three)  polynomial  drift  in  (u,v)  is  fitted  to  the  data 
and  subsequently  subtracted  fi:om  the  data  to 
obtain  residuals.  Trend  surfaces  are  not  neces¬ 
sarily  limited  to  polynomial  forms.  For  example,  a 
numerical  model  of  groundwater  flow  may  be  used 
to  obtain  residuals  of  groundwater  head  data. 

b.  In  theory,  the  polynomial  trend  reflects  a 
slowly  varying  drift  in  the  spatial  mean  and,  as 
such,  one  regional  trend  surface  should  be  fitted  to 
all  the  data.  However,  often  the  drift  and  residuals 
are  obtained  locally;  that  is,  using  moving  neigh¬ 
borhoods  of  locations.  Estimates  of  these  values  at 
any  point  are  thus  made  using  a  reduced  number 
(usually  between  8  and  16)  of  surroimding  loca¬ 
tions.  This  is  done  because  ultimately  the  kriging 
estimates  are  made  using  only  tiie  data  values  in 
the  given  neighborhood.  Manipulating  the  kriging 


4-3 


•tfOlft  6uil09!pu 


ETL  1110-1-175 
30  Jun  97 


Figure  4-2.  Squared  differencing  of  values  for  aii  possibie  pairs  of  points  for  Saratoga  data 
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Figure  4-3.  Initial  sample  variogram  points  for  Saratoga  data 
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matrices  takes  less  time  when  a  smaller  number  of 
data  values  are  used  to  make  estimates  and  these 
efficiencies  can  be  significant  when  dealing  with 
large  data  sets.  Little  accuracy  is  lost  because  the 
nearest  neighbors  are  the  most  influential  in  the 
kriging  weighting  scheme. 

c.  A  parabolic  shape  to  y  for  the  Saratoga 
data  is  shown  in  Figure  4-3  for  the  sample  vario- 
gram  points  plotted  for  lags  up  to  about  32  km  (the 
first  four  points)  and  for  lags  beyond  about  56  km. 
The  presence  of  a  parabolic  shape  in  the  sample 
variogram  points  was  not  surprising,  because 
examination  of  die  data  indicates  a  north-south 
gradient  in  the  groundwater  levels.  The  simplest 
polynomial  trend,  linear  in  u  and  v,  was  fitted  to  all 
the  data  using  ordinary  least-squares  estimation. 
Residuals  obtained  by  subtracting  this  regional 
trend  surface  from  the  data  were  used  to  reestimate 
Y  in  Equation  4-2  and  the  sample  variogram  for  the 
residuals  is  shown  in  Figure  4-4. 

4-4.  Variogram  Refinement 

a.  In  the  previous  section,  an  initial  y  was 
specified  by  points  computed  from  Equation  4-2. 

In  general,  the  larger  Nfli,)  is  for  any  bin  or  lag 
interval  k,  the  more  reliable  will  be  the  points 
defining  y(hi).  Also,  die  larger  K  is,  the  greater  the 
number  of  sample  variogram  points  shaping  y. 
However,  N(hi)  and  K  are  competing  elements  of 
y.  Joumel  and  Huijbregts  (1978)  suggest  that 
each  lag  interval  k  should  have  N(hi)  equal  to  at 
least  30  pairs.  The  American  Society  for  Testing 
and  Materials  (Standard  D5922-96)  suggests 

20  pairs  for  each  lag  interval.  For  small  data  sets 
the  number  of  intervals  may  have  to  be  small  to 
guarantee  either  number  of  recommended  pairs  in 
all  intervals. 

b.  It  is  difficult  to  determine  the  minimum 
number  of  data  values  n  needed  to  satisfy  the  N(hJ 
requirements  for  all  lag  intervals  of  a  sample  vari¬ 
ogram.  Simple  combinatorial  analysis  can  estab¬ 
lish  a  sample  size  needed  to  achieve  a  given  total 
number  of  distinct  pairs  of  items  taken  fi'om  the 
sample,  but  it  does  not  address  the  spatial 


considerations  needed  for  proper  lagging.  As  an 
example,  for  data  collected  on  a  uniform  grid  and 
equal-sized  bins,  fixing  an  n  to  just  satisfy  die 
minimum  N(hi)  for  the  smaller  lags  will  yield 
insufficient  data  pairs  to  meet  the  minimum  N(h^ 
for  the  larger  lags.  Fixing  an  « to  assure  the  mini- 
mim  N(h^)  for  the  larger  lags  will  generally  have 
N(hi)  much  greater  than  the  minimum  for  the  smal¬ 
ler  lags.  Therefore,  the  question  of  how  much  data 
is  required  to  adequately  compute  a  variogram 
should  also  address  the  relative  locations  of  the 
data-collection  sites. 

c.  The  first  1 0  of  the  12  bins  for  y  for  the 
Saratoga  data  contained  more  than  30  data  pairs. 
Therefore,  the  bin  width  can  be  decreased  to  get 
more  points  defining  the  early  part  ofy.  These 
bin-width  adjustments  can  be  made  to  refine  y 
whether  it  is  computed  fi'om  the  data  or  fi'om  the 
residuals.  A  plot  of  y  for  the  residuals  for  the  Sar¬ 
atoga  groundwater  elevations  with  the  bin  width 
narrowed  to  about  6.5  km  is  shown  in  Figure  4-5. 

d.  Spatial  data  are  usually  not  collected  on  a 
uniform  grid  but  occur  in  a  pattern  that  reflects 
problem  areas,  accessibility,  and  general  spatial 
coverage.  In  the  Saratoga  data  set,  nonuniform 
data  spacing  results  in  the  number  of  data  pairs  in 
each  bin,  although  still  greater  than  30,  being 
highly  variable  among  the  bins.  This  variability 
jdelds  different  reliabilities  for  the  points  defining 
y.  To  establish  a  balance  for  N(hi)  among  the 
bins,  variable  bin  sizes  can  be  used  so  that  each 
bin  contains  approximately  the  same  (large)  num¬ 
ber  of  points.  A  bin  with  fewer  points  can  be 
coalesced  with  an  adjacent  bin  to  form  a  wider  bin 
wifii  a  greater  number  of  points.  Conversely,  a  bin 
with  an  excessive  number  of  points  can  be  sub¬ 
divided  into  adjacent,  narrower  bins.  The  coales¬ 
cing  and  subdividing  procedure  is  largely  trial  and 
error,  until  the  distribution  of  the  pairs  of  points  is 
satisfactory  to  the  investigator. 

e.  The  values  of  y  at  the  smaller  lag  values 
are  the  most  critical  to  define  the  appropriate  y. 
Therefore,  the  trade-off  between  the  number  of 
bins  and  the  number  of  data  pairs  within  each  bin 
can  be  varied  for  different  regions  of  the  sample 
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Figure  4-4.  Sample  variogram  points  for  ordinary-least  squares  trend  residuals  of  Saratoga  data 
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Figure  4-5.  Sample  variogram  points  for  ordinary-least  squares  trend  residuals  of  Saratoga  data  binned  to  4  miles 
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variogram.  At  smaller  lags,  the  numbers  of  data 
pairs  per  bin  can  be  nearer  the  minimum  N(hi)  to 
define  more  bins.  At  larger  lags,  a  smaller  number 
of  wider  bins  would  be  adequate.  Knowing  tiiat 
the  variogram  should  be  a  smooth  fimction,  ulti¬ 
mately  the  analyst  visually  decides  when  the  sam¬ 
ple  variogram  is  sufficiently  defined  at  all  lags  to 
adequately  approximate  a  &eoretical  variogram. 

4-5.  Transformations  and  Anisotropy 
Considerations 

a.  Transformations. 

(1)  A  transformation  is  applied  to  a  data  set 
generally  for  one  of  two  interrelated  purposes. 

First,  a  transformation  can  reduce  the  scale  of 
variability  of  highly  fluctuating  data.  This  varia¬ 
bility  would  especially  occur  witii  contaminant 
concentrations  in  which  order  of  magnitude 
changes  in  data  at  proximate  sites  are  not  uncom¬ 
mon.  The  effects  of  such  data  would  be  erratic 
sample  variogram  points  as  exhibited  by  a  large- 
amplitude,  ill-defined  sawtooth  pattern  of  the  lines 
connecting  the  points. 

(2)  Second,  a  proper  transformation  of  data 
whose  probability  distribution  is  highly  skewed 
often  produces  a  set  of  values  that  is  approxi¬ 
mately  normally  distributed  by  mitigating  the 
influence  of  problematic  extreme  data  values.  A 
data  set  with  a  normal  distribution  is  important  in 
kriging  when  confidence  levels  of  the  estimates  are 
desired.  This  usage  of  confidence  levels  in  a 
kriging  analysis  will  be  illustrated  in  Chapter  5. 

(3)  Among  file  more  common  transformations 
is  the  natural  log  transform.  As  an  example,  for 
this  transformation,  the  y  will  be  the  sample  vari¬ 
ogram  values  of  logarithms,  and  subsequent  kriged 
estimates  will  be  logarithms.  Another  transfor¬ 
mation  that  is  often  used,  especially  in  spatial 
analyses  of  contaminant  levels,  is  die  indicator 
transformation  described  in  Chapter  2.  Although  a 
transformation  might  achieve  better-behaved  sam¬ 
ple  variogram  points,  there  are  subtleties  to 


consider  in  interpreting  the  kriging  results  of  the 
transformed  data  or  in  back-transforming  kriging 
results  into  the  untransformed  (original)  units,  as 
discussed  in  Chapter  1.  If  a  satisfactory  variogram 
of  the  original  data  cannot  be  achieved  and  a  trans¬ 
formation  is  indicated,  the  sample  variogram  com¬ 
putation  process  must  begin  again  with  Equa¬ 
tion  4-2.  Even  though  no  transformation  was 
needed  for  the  Saratoga  data,  an  example  using  a 
logarithmic  transformation  and  an  example  using 
the  indicator  transformation  are  presented  in 
Chapter  5. 

b.  Directional  variograms  and  anisotropy. 

(1)  Anisotropy  in  the  data  can  be  investigated 

by  computing  sample  variograms  for  specific 
directions.  Locations  included  in  a  given  direction 
from  any  other  location  are  contained  in  a  sector  of 
a  circle  of  radius  centered  on  the  location. 

The  sector  is  specified  by  two  angular  inputs.  The 
first  is  a  bearing  defining  the  specific  direction  of 
interest  [measured  counterclockwise  fi'om  east 
(=0°)]  and  the  second  is  a  (window)  angle  defining 
an  arc  of  rotation  swept  in  both  directions  from  the 
bearing.  Thus,  in  the  terminology  used  here,  the 
total  angle  defining  a  direction  is  equal  to  twice  the 
window  angle.  Differences  in  sample  variograms 
computed  using  these  angle  windows  specified  for 
different  directions  can  be  an  indication  of 
anisotropy. 

(2)  Anisotropy  is  generally  either  geometric  or 
zonal.  Geometric  anisotropy  is  indicated  by  direc¬ 
tional  theoretical  variograms  that  have  a  common 
sill  value,  but  different  ranges.  The  treatment  of 
geometric  anisotropy  is  dependent  on  the  software 
used.  The  lags  of  the  directional  variograms  can 
be  scaled  by  the  ratio  of  their  ranges  to  the  range 
of  a  standard  or  common  variogram.  In  some 
cases,  the  lags  of  all  directional  variograms  are 
scaled  by  their  respective  ranges,  and  a  common 
variogram  with  a  range  of  1  is  used.  Groundwater 
contaminant  plumes  often  have  geometric  aniso¬ 
tropy  in  which  the  prevailing  plume  direction 
would  have  a  greater  range  than  that  of  the  transect 
of  the  plume. 
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(3)  Zonal  anisotropy  is  indicated  by  direc¬ 
tional  variograms  that  have  the  same  range  but 
different  sills.  Pure  zonal  anisotropy  is  usually  not 
seen  in  practice;  typically  it  is  found  in  combina¬ 
tion  with  geometric  anisotropy.  Such  mixed 
anisotropy  may  be  encountered  if  evaluating  the 
variograms  of  three-dimensional  HTRW  sampling 
results.  Variability  of  such  data  (as  indicated  by 
the  sill  of  the  variogram)  may  be  significantly 
higher  and  the  range  significantly  shorter  in  the 
vertical  direction  than  in  the  horizontal  direction. 

In  order  to  model  this  mixture  of  anisotropic  vari¬ 
ograms,  the  overall  variogram  is  set  to  a  weighted 
sum  of  individual  models  of  the  directional  vario¬ 
grams  scaled  by  flieir  ranges.  In  this  process, 
called  nesting,  the  choice  of  weights  requires  a  trial 
and  error  approach  with  a  constraint  that  the  sum 
of  the  weights  equals  the  sill  of  the  overall  vario¬ 
gram.  The  reader  is  referred  to  Isaaks  and 
Srivastava  (1989,  pp.  377-390)  for  further  infor¬ 
mation  on  both  types  of  anisotropy. 

(4)  For  a  given  number  of  data  locations, 
directional  sample  variograms  will  necessarily 
have  fewer  points  for  any  lag  when  compared  to 
the  points  for  the  same  lag  in  the  omnidirectional 
variogram.  Hence,  there  will  be  less  reliability  in 
the  directional-variogram  point  values,  which 
would  be  a  critical  constraining  factor  for  small 
data  sets  or  for  a  data  pattern  that  does  not  con¬ 
form  to  a  direction  of  anisotropy.  For  a  general 
idea  of  the  sufficiency  of  the  data  to  adequately 
determine  any  anisotropy,  the  computations  of 
anisotropic  sample  variograms  can  be  initially 
limited  to  two  orthogonal  directions  with  window 
angles  of  45  deg. 

(5)  Directional  sample  variograms  also  can  be 
used  to  further  delineate  nonstationarity  of  the 
spatial  mean.  If  the  omnidirectional  sample  vario¬ 
gram  indicates  a  drift  in  the  data,  the  directional 
variograms  may  determine  the  dimensionality  of 
the  drift.  That  is,  although  fliey  may  not  establish 
the  degree  of  the  polynomial  in  the  drift  equation, 
the  directional  sample  variograms  can  indicate  the 
relative  strengflis  of  the  drift  in  the  m  and  v 
directions. 


(6)  The  computed  sample  variograms  for  the 
general  north-south  and  east-west  directions  for  the 
Saratoga  data  are  shown  in  Figure  4-6.  The  north- 
south  variogram  is  specified  by  a  direction  angle  of 
90  deg  and  a  window  angle  of  45  deg.  The  north- 
south  variogram  reveals  the  preferential  north- 
south  data  alignment  by  mimicking  the  omni¬ 
directional  (direction  angle  =  0  deg  and  window 
angle  =  90  deg)  sample  variogram  of  Figure  4-3. 
The  east-west  variogram  is  specified  by  a  direction 
angle  of  0  deg  and  a  window  angle  of  45  deg.  The 
lack  of  pairs  of  locations  for  the  east-west  vario¬ 
gram  precludes  a  good  analysis  for  this  direction, 
but  the  overlap  of  the  few  sufficiently  defined 
variogram  points  with  the  north-south  variogram 
indicates  a  consistency  of  drift  in  the  two  direc¬ 
tions.  Because  of  this  consistency,  an  isotropic 
variogram  is  assumed  for  the  Saratoga  residuals. 
An  example  of  anisotropic  variograms  is  described 
in  Chapter  5. 

4-6.  Fitting  a  Theoretical  Variogram  to  the 
Sample  Variogram  Points 

a.  General. 

(1)  The  importance  of  adequately  defining  the 
bin  values  of  a  sample  variogram  is  substantiated 
by  the  need  to  accurately  generalize  the  data-based 
behavior  of  the  sample  variogram  by  a  theoretical 
variogram  y.  The  parameters  controlling  the  spe¬ 
cific  behavior  of  theoretical  variograms  are  the 
nugget  value,  the  range,  the  sill,  or  in  the  case  of  a 
linear  variogram,  a  slope  parameter.  Of  these 
parameters,  the  nugget  and  the  sill  can  be  related  to 
properties  and  statistics  of  the  data. 

(2)  The  nugget  is  essentially  the  extrapolation 
of  the  sample  variogram  to  a  lag  of  zero.  It 
reflects  the  uncertainty  of  the  variogram  at  lags 
that  are  much  smaller  than  the  minimum  separation 
between  any  two  data  locations.  The  nugget  value 
can  include  measurement  error  variance,  and  an 
estimate  of  this  variance  will  approximate  a  mini¬ 
mum  value  of  the  extrapolation. 
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Figure  4-6.  Initial  directional  sample  variogram  points  for  raw  Saratoga  data-A,  north-south  and  B,  east-west 
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(3)  The  sill  deterniines  the  maximum  value  of 
a  variogram  and  approximates  the  variance  of  the 
data.  However,  the  points  defining  y  take  prece¬ 
dence  over  the  sample  variance  in  locating  the  sill. 
Some  variograms  are  unbounded,  and  others  may 
only  reach  a  sill  value  asymptotically.  A  defined 
sill  allows  conversion  of  die  variogram  to  a  covari¬ 
ance  function  using  Equation  2-27,  which  is  gen¬ 
erally  done  because  computations  in  the  kriging 
algorithms  are  more  efficiently  performed  using  a 
covariance  flmction. 

(4)  Fitting  a  function  to  the  sample  variogram 
values  can  range  from  a  visual  fit  to  a  sophisti¬ 
cated  statistical  fit.  A  statistical  fit  is  an  objective 
method  as  long  as  the  choice  of  bins  and  weighting 
of  the  sample  variogram  points  remain  fixed. 
However,  because  the  inputs  will  vary  with  investi¬ 
gators,  inherent  subjectivity  persists  as  in  a  visual 
fit.  A  final  calibration  of  the  variogram  param¬ 
eters  would  be  based  on  the  kriging  algorithm  and, 
thus,  either  of  the  initial  fitting  methods  at  this 
stage  would  suffice. 

(5)  Because  the  initial  part  of  the  variogram 
has  the  most  effect  on  subsequent  kriging  output,  a 
good  estimate  of  the  nugget  value  becomes  a  most 
important  first  step.  The  range  and  the  sill,  in  that 
order,  complete  the  ranking  of  the  influence  of 
variogram  parameters  on  the  output  of  a  geostatis- 
tical  analysis.  Whatever  the  fitting  method  used, 
the  theoretical  variogram  needs  to  be  supported  by 
the  sample  variogram  values.  For  variograms  with 
a  range  parameter,  this  siq)port  should  extend  to 
the  range.  Joumel  and  Huijbregts  (1978)  suggest 
diat  this  support  should  be  through  one-half  the 
dimension  of  the  field  or  essentially  through  one- 
half  the  maximum  lag  distance  of  the  sample  data. 

(6)  Most  geostatistical  studies  can  be  success¬ 
fully  completed  using  the  following  four  singular 
theoretical  variogram  forms:  exponential,  spheri¬ 
cal,  Gaussian,  and  linear  functions  (Figure  2-3). 
For  the  example  variogram  determination 
described  in  this  section,  only  one  of  these  singular 
forms  will  be  selected;  however,  positive  linear 
combinations  of  these  forms  also  are  acceptable  as 
theoretical  variograms  (see  section  4-5b). 


Geometric  relationships  to  aid  in  obtaining  param¬ 
eters  for  the  four  variogram  forms  are  described  in 
the  following  sections  and  are  illustrated  in  Fig¬ 
ure  2-3  for  reference. 

b.  Exponential  variogram. 

The  exponential  variogram  (Equation  2-23)  is 
specified  by  the  nugget  g,  sill  s,  and  a  practical 
range  value  r.  The  range  is  qualified  as  practical 
because  the  sill  is  reached  only  asymptotically. 

The  initial  behavior  of  the  exponential  variogram  is 
different  from  the  behavior  of  the  spherical  vario¬ 
gram  in  tiiat  the  convex  behavior  extends  to  the 
nugget  value  (Figure  2-3).  Again,  a  nugget  value 
and  a  sill  value  are  first  specified  based  on  the  y 
points.  The  practical  range  is  chosen  so  that  the 
value  of  the  resulting  exponential  function  evalu¬ 
ated  at  the  practical  range  lag  is  95  percent  of  the 
sill  value.  The  specified  exponential  function 
would  mesh  with  the  sample  variogram  points  at 
least  through  the  practical  range  lag.  An  initial 
estimate  of  the  practical  range  can  be  made  by 
checking  if  the  intersection  of  the  sill  value  with  a 
line  tangent  to  the  variogram  at  the  nugget  is  at  a 
lag  value  equal  to  one-third  of  the  assumed  prac¬ 
tical  range  value  as  illustrated  in  Figure  2-3. 
Examples  of  the  exponential  variogram  may  be 
formd  in  spatial  studies  of  sulfate  and  total  alka¬ 
linity  in  groundwater  systems  (Myers  et  al.  1980). 

c.  Spherical  variogram.  The  spherical  vario¬ 
gram  parameters  (Equation  2-24)  are  a  nugget 
value  g,  a  range  r,  and  a  sill  s.  At  smaller  lag 
values  the  sample  variogram  points  indicate  linear 
behavior  from  the  nugget  that  then  becomes  con¬ 
vex  and  reaches  a  sill  value  at  some  finite  lag 
(Figure  2-3).  A  sill  is  estimated,  and  a  line  ^wn 
through  the  points  of  die  initial  linear  part  of  the 
variogram  would  intersect  the  sill  at  a  lag  value 
approximately  equal  to  two-thirds  of  the  range. 
With  these  estimates  of  the  parameters,  a  spherical 
variogram  is  defined  that  should  be  supported  by 
the  sample  variogram  points.  If  the  spherical  plot 
does  not  fall  near  the  sample  variogram  points, 
adjustments  need  to  be  made  to  the  parameter  esti¬ 
mates  and  the  subsequent  fit  evaluated.  Although 
the  spherical  variogram  is  one  of  the  most  often 
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used  models  for  real  valued  spatial  studies,  it 
seems  to  be  a  predominant  model  for  indicator 
values  at  various  cutoff  levels  as,  for  example,  in  a 
study  of  lead  contamination  (Joumel  1993). 

d.  Gaussian  variogram.  The  Gaussian  vario- 
gram  parameters  (Equation  2-25)  are  a  nugget 
value  g,  and  a  sill  s,  and  this  variogram  also  has  a 
practical  range  r.  The  Gaussian  variogram  is  hori¬ 
zontal  from  the  nugget,  becomes  a  concave  upward 
function  at  small  lags,  inflects  to  concave  down¬ 
ward,  and  asymptotically  approaches  a  sill  value 
(Figure  2-3).  After  a  nugget  value  and  sill  value 
are  specified  based  on  the  points,  the  variogram 
value  at  a  lag  of  one-half  flie  estimated  practical 
range  will  be  two-thirds  of  the  sill  value.  Again, 
this  fitted  variogram  needs  to  be  supported  by  the 

Y  points  to  a  reasonable  degree.  As  will  be 
described  in  the  example  using  the  Saratoga  data, 
the  Gaussian  variogram  often  is  used  where  die 
variable  analyzed  is  spatially  very  continuous, 
such  as  a  groundwater  potentiometric  surface. 

e.  Linear  variogram.  Parameters  for  a  linear 
variogram  (Equation  2-26)  are  a  nugget  value  g, 
and  a  slope  b.  Sample  points  indicating  a  linear 
variogram  would  increase  linearly  from  the  nugget 
value  and  fail  to  reach  a  sill  even  for  large  lags 
(Figure  2-3).  With  the  nugget  as  the  intercut,  the 
slope  is  computed  for  the  line  passing  through  the 

Y  points.  A  pseudosill  s  can  be  defined  as  the 
value  of  the  line  at  the  greatest  lag,  between 
any  two  locations.  This  lag  becomes  the  defacto 
range  r  for  a  linear  variogram.  Examples  of  the 
usage  of  the  linear  variogram  occur  in  hydrogeo¬ 
chemical  studies  of  specific  conductance  and  in 
studies  of  trace  elements  such  as  barium  and  boron 
(Myers  et  al.  1980). 

4-7.  Additional  Trend  Considerations 

a.  If  a  drift  in  the  data  is  indicated  as  in  sec¬ 
tion  4-3,  the  theoretical  variogram  of  residuals 
that  has  been  fitted  thus  far  is  used  to  update  the 
drift  equation.  Although  ordinary  least  squares 
often  suffices  for  computing  a  polynomial  drift 
equation,  drift  determination  itself  is  a  fimction  of 


Y  when  the  data  are  spatially  correlated.  But  y 
carmot  be  estimated  until  a  drift  equation  is 
obtained  to  yield  the  residuals.  Therefore,  obtain¬ 
ing  a  sample  variogram  and  a  subsequent  theoreti¬ 
cal  variogram  from  drift  residuals  of  a  specified 
drift  form  is  an  iterative  process  (David  (1977), 
pp.  273-274)  flamed  by  fee  following  steps: 

(1)  An  initial  variogram  is  specified  and  drift 
coefficients  are  computed  to  obtain  residuals.  For 
this  step,  a  pure  nugget  (i.e.  constant)  variogram 
can  be  used  to  compute  fee  initial  estimates  of  fee 
drift  coefficients.  This  is  an  ordinary  least-squares 
estimate  of  fee  drift  yielding  a  first-iteration  sam¬ 
ple  variogram  of  residuals. 

(2)  A  theoretical  variogram  is  fitted  to  fee 
sample  variogram  of  fee  residuals  and  is  used  to 
obtain  updated  drift  coefficients. 

(3)  The  residuals  from  fee  drift  obtained  in 
step  b  are  used  to  compute  an  updated  sample 
variogram. 

(4)  The  sample  variogram  computed  at  fee  end 
of  step  3  is  compared  to  fee  sample  variogram  of 
step  2.  If  the  two  sample  variograms  compare 
favorably,  then  fee  theoretical  variogram  from 
step  2  is  accepted  as  the  variogram  of  residuals  for 
subsequent  ktiging  computations.  If  fee  sample 
variogram  from  step  3  differs  markedly  from  fee 
sample  variogram  of  step  2,  steps  2-4  are  repeated 
using  fee  sample  variogram  of  fee  most  recent 
step  c. 

b.  Generally,  fee  plot  of  the  points  of  y  from 
a  set  of  residuals  will  initially  increase  wife  h, 
reach  a  maximum,  and  then  decrease  as  seen  in 
Figure  4-4.  This  typical  haystack-type  behavior, 
discussed  by  David  (1977,  pp.  272-273),  is  attri¬ 
buted  to  a  bias  resulting  from  fee  estimation  error 
in  fee  drift  form  and  its  coefficients.  Thus,  this 
behavior  in  fee  variogram  of  fee  residuals  gen¬ 
erally  would  more  readily  occur  wife  a  higher 
degree  of  drift  polynomial.  This  behavior  should 
not  prohibit  acceptable  variogram  determination 
because  fee  initial  points  of  fee  sample  variogram 
of  residuals  are  still  indicative  of  fee  theoretical 
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variogram.  For  example,  tiie  lag  associated  with 
the  maximum  of  y  of  the  residuals  can  be  a  good 
first  approximation  for  the  range  of  the  theoretical 
variogram. 

4-8.  Outlier  Detection 

a.  Outliers  in  a  data  set  can  have  a  substantial 
adverse  effect  on  y.  However,  divergent  data 
values  can  be  screened  for  evaluation  using  a 
Hawkins  statistic  (Hawkins  1980),  which  is 
described  in  the  context  of  kriging  by  Krige  and 
Magri  (1982).  A  neighborhood  containing  4  to  10 
data  points,  approximately  normally  distributed, 
around  each  suspected  outlier  must  be  defined. 
Despite  potential  outliers  in  the  data  set,  a  best 
guess  initial  theoretical  variogram  also  is  needed. 

b.  The  Hawkins  statistic  is  obtained  by  com¬ 
paring  a  suspect  datum  to  the  mean  value  of  tiie  4 
to  10  surrounding  data,  the  smaller  number  being 
sufficient  if  the  variability  is  lower.  The  spacing 
between  these  surrounding  points  is  accounted  for 
by  the  properties  of  the  chosen  variogram.  A  value 
for  the  statistic  of  3.84  or  higher  would  indicate  an 
outlier  on  the  basis  of  a  95-percent  confidence 
interval.  A  larger  number  of  surrounding  points 
has  the  direct  effect  of  increasing  the  magnitude  of 
the  statistic.  Anomalous  points  are  removed  fi-om 
the  data  set  and  the  procedures  described  for 
obtaining  the  sample  variogram  are  repeated  for 
the  smaller  data  set.  There  were  no  outlier  prob¬ 
lems  in  the  Saratoga  data. 

c.  There  is  debate  among  geostatisticians 
regarding  the  merit  of  automated  outlier-detection 
mefliods.  A  procedure  such  as  that  described  here 
is  presented  as  an  investigative  tool  wifli  the  under¬ 
standing  that  the  investigator  will  also  use  atten¬ 
dant  justification  along  with  a  Hawkins-type 
statistic  to  ultimately  decide  if  a  data  value  is 
discarded  as  a  true  outlier  or  retained  as  a  valid 
observation.  In  some  situations,  highly  problem¬ 
atic  data  values  are  removed  for  computation  of 
the  sample  variogram  points  but  are  reinstated  for 
kriging. 


4-9.  Cross-Validation  for  Model 
Verification 

a.  General. 

(1)  Parameters  of  the  theoretical  variogram 
obtained  from  the  initial  fitting  and  refinement  of 
the  sample  variogram  are  calibrated  using  a  krig¬ 
ing  cross-validation  technique.  In  this  procedure, 
the  fitted  theoretical  variogram  is  used  in  a  kriging 
analysis  in  which  data  values  are  individually  sup¬ 
pressed  and  estimates  made  at  the  location  using 
subsets  of  the  remaining  points.  As  described  in 
section  4-3,  these  subsets  are  the  data  points  in  a 
moving  neighborhood  surrounding  the  point  under 
consideration.  The  calibration  estimate  made  at 
each  data  location  requires  a  matrix  inversion, 
which  could  be  very  time-consuming  if  all  remain¬ 
ing  data  locations  were  used  to  constract  the 
matrices  rather  than  just  those  within  a  neighbor¬ 
hood  of  a  limited  search  radius. 

(2)  After  kriged  values  at  all  data  locations 
have  been  estimated  in  the  above  manner,  the  data 
are  used  with  their  kriged  values  and  kriging  stan¬ 
dard  deviation  to  obtain  cross-validation  statistics. 
A  successful  calibration  is  based  on  criteria  for 
these  statistics,  which  are  described  in  the  next 
section.  If  the  criteria  cannot  be  reasonably  met  by 
adjusting  the  parameters  in  the  given  theoretical 
variogram  function,  then  calibration  should  be 
reinitialized  with  a  different  theoretical  variogram 
function.  In  some  data  sets  with  nonstationary 
spatial  means,  the  drift  polynomial  may  have  to  be 
changed  as  well  as  the  variogram  to  achieve  a 
satisfactory  calibration. 

b.  Calibration  statistics. 

(1)  The  kriging  cross-validation  error  e,-  cor¬ 
responding  to  measurement  z(Xj)  is  defined  as 

e,  =  z  (i)  -  z  (i) 

where  z  (i,)is  the  kriged  estimate  of  z  (i  )  based 
on  the  remaining  n-1  measurements  in  the  data  set. 
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The  kriged  estimate  is  obtained  by  ordinary  kriging 
if  the  spatial  mean  is  constant  or  by  universal  loig- 
ing  if  the  spatial  mean  is  not  stationary.  A  reason¬ 
able  criterion  for  selecting  a  theoretical  variogram 
would  be  to  minimize  the  squared  errors,  ^  ef , 
with  respect  to  the  variogram  parameters.  How 
ever,  unlike  ordinary  least-squares  regression, 
which  also  minimizes  the  sum  of  squared  errors, 
simply  minimizing  the  squared  errors  is  not  suffi¬ 
cient  for  kriging  because  the  resulting  model  can 
yield  highly  biased  estimates  of  the  kriging  vari¬ 
ances,  ol  (ac),  where  (ac.)is  the  kriging  vari¬ 
ance  at  location  x,.  This  simple  minimization 
would  give  unrealistic  measures  of  the  accuracy  of 
the  kriging  estimates.  To  guard  against  such  bias, 
an  expression  for  the  square  of  a  reduced  kriging 
error  is  defined: 


where  the  kriging  variances  are  computed  using 
either  Equation  2-36  or  2-47.  If  the  kriging  vari¬ 
ance  is  an  unbiased  estimate  of  the  true  mean- 
squared  error  of  estimate,  then  the  reduced  kriging 
errors  would  have  an  average  near  one.  Therefore, 
the  standard  cross-validation  procedure  for  evalu¬ 
ating  a  theoretical  variogram  is: 
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(2)  The  expression  to  be  minimized  is  called 
the  kriging  root-mean-squared  error  and  the  con¬ 
straint  is  called  the  reduced  root-mean-squared 
error.  The  reduced  root-mean-squared  error 
should  be  well  within  the  interval  having  endpoints 
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1978).  An  additional  check  on  the  cross-validation 


results  is  the  unbiasedness  condition  where 


-E«,  -  0. 


(3)  As  indicated  in  Chapter  2,  if  probabilistic 
statements  concerning  an  actual  value  ofZat  an 
unmeasured  location  are  to  be  made  relative  to  the 
kriged  estimate  and  the  kriging  variance  at  the 
location,  it  is  necessary  to  explore  the  distribution 
of  the  cross-validation  kriging  errors.  In  particu¬ 
lar,  it  is  desirable  that  the  reduced  errors,^ 
=l,2...,n,  are  approximately  normally  distributed 
with  mean  0  and  variance  1.  A  histogram  or  nor¬ 
mal  probability  plot  of  the  reduced  kriging  errors 
can  be  used  to  assess  the  validity  of  assuming  a 
standard  normal  distribution  for  the  reduced  krig¬ 
ing  errors.  Additionally,  if  the  distribution  of 
reduced  kriging  errors  can  be  assumed  to  be  stan¬ 
dard  normal,  outliers  not  detected  using  the  method 
discussed  in  section  4-7  may  be  detected  by  com¬ 
paring  the  absolute  values  of  the  reduced  kriging 
errors  to  quantiles  of  the  standard  normal 
distribution. 

(4)  Using  the  Saratoga  data,  a  spherical  vario¬ 
gram  was  fitted  to  the  refined  sample  variogram  of 
the  residuals.  The  estimated  nugget  was  about 
1.49  m^  the  sill  was  133.8  m^,  and  the  range  was 
about  48  km.  Because  of  difficulty  in  determining 
an  exact  extrapolated  value  for  the  nugget,  the 
value  of  1.49  m^  was  selected  based  on  an  esti¬ 
mated  measurement  error  related  to  obtaining 
water  levels  at  the  well  depths  in  the  Saratoga 
valley. 

(5)  After  two  iterations  using  drift  residuals, 
as  described  in  section  4-7,  a  final  variogram  was 
chosen  with  a  nugget  of  1.49  m^,  a  sill  of  148.6  m^, 
and  a  range  of  44.8  km  (Figure  4-7).  These 
parameters  defined  the  theoretical  variogram  used 
to  obtain  the  cross-validation  errors  using  univer¬ 
sal  kriging  with  an  assumed  linear  drift.  The  best 
combination  of  statistics  that  could  be  obtained 
after  several  attempts  at  refining  the  model  were  a 
root-mean-squared  error  of  3.45  m  and  a  reduced 
root-mean-squared  error  of  0.5794.  The 
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Figure  4-7.  Sample  variogram  points  and  theoretical  spherical  fit  for  iterated  Saratoga  residuals 
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reduced-root-mean-squared  error  is  too  small, 
indicating  that  the  kriging  variances  produced  by 
the  model  are  too  large  compared  to  the  actual 
squared  errors.  This  fact,  coupled  with  the  rather 
large  root-mean-squared  error,  makes  the  theo¬ 
retical  variogram  model  unacceptable.  In  sec¬ 
tion  4-9c,  a  Gaussian  variogram  is  fitted  to  the 
data  that  produces  much  better  cross-validation 
results  than  the  results  for  the  spherical  variogram. 

c.  Variogram-parameter  adjustments. 

(1)  If  any  of  the  cross-validation  statistics 
vary  unacceptably  from  their  suggested  values, 
minor  adjustments  to  the  variogram  parameters 
can  be  made  to  attempt  to  improve  the  statistics. 
Whatever  modifications  are  made  to  the  param¬ 
eters,  they  should  not  have  to  be  so  severe  that  the 
variogram  function  drastically  deviates  from  the 
sample  variogram  points.  If  the  support  of  the 
sample  variogram  points  is  compromised  in  order 
to  achieve  acceptable  cross-validation  results  with 
the  given  drift-variogram  model,  a  different  drift- 
variogram  combination  should  be  investigated. 

(2)  A  reduced  root-mean-squared  error  that  is 
unacceptable  may  be  improved  upon  by  adjusting 
the  range  parameter  or  the  nugget  value  of  the 
variogram.  Modifying  the  range  parameter  should 
be  considered  first  and  any  shifts  in  flie  nugget 
value  should  be  minimal  and  made  only  as  a  final 
recourse.  Calibration  errors  are  relatively  insen¬ 
sitive  to  minor  adjustments  of  the  sill. 

(3)  If  the  reduced  root-mean-squared  error  is 
too  small,  as  in  the  Saratoga  example,  extending 
the  range  (equivalent  to  decreasing  the  slope  for  a 
linear  variogram)  will  decrease  the  kriging  vari¬ 
ance  and  thus  increase  the  reduced  root-mean- 
squared  error.  If  a  shift  in  the  nugget  value  is 
required,  a  decrease  in  the  nugget  will  reduce  flie 
kriging  variance.  If  the  reduced  root-mean- 
squared  error  is  too  large,  then  a  contraction  of  the 
range  or  a  positive  shift  in  the  nugget  value  can  be 
made,  keeping  in  mind  the  above  caveat  of  priority 
and  extent  of  the  changes.  Changes  in  these 


parameters,  generally,  also  will  have  an  effect  on 
the  mean-squared  error.  The  larger  the  nugget  is 
as  a  percentage  of  the  sill,  the  larger  the  mean- 
squared  error  will  be.  In  general,  improvements  in 
one  statistic  are  usually  made  at  the  expense  of  the 
other  statistics.  The  optimization  of  the  statistics 
as  a  set  is,  in  effect,  a  trial  and  error  procedure  that 
is  operationally  convergent. 

(4)  Reduced  kriging  errors  may  not  approxi¬ 
mate  a  standard  normal  distribution.  If  this  is  the 
case,  a  transformation  of  the  data  may  be  needed 
to  achieve  a  more  normal  distribution,  and  the 
variogram  estimation  procedure  would  be  repeated. 

(5)  Because  no  convergence  could  be  reached 
for  parameter  values  of  a  spherical  variogram  for 
the  Saratoga  data,  a  Gaussian  theoretical  vario¬ 
gram  was  fitted  to  the  sample  variogram  of 
residuals  in  Figure  4-4.  This  choice  was  made 
because  the  initial  sample  variogram  points  could 
be  interpreted  to  have  a  slight  upward  concavity, 
but  eventually  reached  a  sill.  This  behavior  can  be 
attributed  to  correlation  rather  than  to  further  drift. 
After  an  iterated  cross-validation  with  the  Gaus¬ 
sian  parameters,  a  Gaussian  variogram  with  a 
nugget  of  1.49  m^,  a  sill  of  185.81  m^,  and  a  range 
of  27.52  km  (Figure  4-8)  yielded  a  root-mean- 
squared  error  of  2.33  m  and  a  reduced-root-mean- 
squared  error  of  1.083.  The  mean  cross-validation 
error  is  0.0195  m.  These  values  represent  an 
improvement  over  the  spherical  variogram  and 
were  deemed  acceptable  for  the  Gaussian 
variogram. 

(6)  A  probability  plot  of  the  reduced  kriging 
errors  using  the  final  Gaussian  variogram  is  shown 
in  Figure  4-9.  It  is  reasonably  linear  between  two 
standard  deviations  and,  thus,  approximates  a 
standard-normal-distribution  function.  Finally,  a 
plot  in  Figure  4-10  of  the  data  versus  their  kriged 
estimates  indicates  that  the  linear  drift-Gaussian 
variogram  model  selected  for  the  Saratoga  data 
would  produce  accurate  estimates  of  groundwater 
elevations  for  interpolation  or  contour  gridding  in 
the  region. 
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Figure  4-8.  Sample  variogram  points  and  theoretical  Gaussian  fit  for  iterated  Saratoga  residuals 
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Figure  4-9.  Cross-validation  probability  plot  for  Saratoga  data 
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Figure  4-10.  Scatter  plot  of  measured  versus  kriging  estimates  from  cross-validation  of  Saratoga  data 
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Chapter  5 

Practical  Aspects  of  Geostatistics  in 
Hazardous,  Toxic,  and  Radioactive 
Waste  Site  Investigations 


5-1.  General 

a.  In  this  chapter,  several  example  applica¬ 
tions  are  described.  The  applications  have  been 
developed  using  hydrologic,  geologic,  and  contami¬ 
nant  data  from  established  and  well-studied  haz¬ 
ardous  waste  sites.  The  real  nature  of  the  data 
permits  discussion  of  some  problems  that  can 
occur  during  HTRW  site  investigations  that  stem 
not  only  from  natural  field  conditions,  but  also 
from  typical  problems  that  are  associated  with  the 
types  of  data  involved.  In  addition,  the  real  nature 
of  the  example  data  provides  an  opportunity  for 
comparison  between  kriging  estimates  and  the  real 
data.  In  accordance  with  the  purpose  and  scope  of 
this  ETL,  these  comparisons  will  be  brief  and 
general.  This  ETL  does  not  provide  the  compre¬ 
hensive  analysis  of  data  that  is  addressed  by  other 
more  elaborate  studies. 

b.  The  principal  intent  of  the  examples  is  to 
provide  systematic  descriptions  for  a  few  of  the 
large  number  of  possible  types  of  applications  that 
investigators  may  use  during  HTRW  site  investi¬ 
gations.  The  examples  are  not  intended  to  provide 
guidance  for  comprehensive  analysis  of  the 
included  data.  This  ETL  will,  however,  present 
some  fundamental  problems  that  can  occur  in 
geostatistical  applications  and,  in  some  examples, 
indicate  some  possible  alternatives. 

c.  With  each  example,  a  purpose  will  be 
established  and  a  general  environmental  setting 
wiU  be  given.  Most  aspects  of  variogram  con¬ 
struction  and  calibration  will  be  briefly  described 
and  illustrated  graphically  and  in  tabidar  form.  A 
comprehensive  treatment  of  variogram  constrac- 
tion  has  been  presented  in  Chapter  4. 

d.  GEO-EAS  software  has  been  used  when¬ 
ever  the  example  data  did  not  need  universal  krig¬ 
ing;  for  those  examples,  STATPAC  was  used.  As 


indicated  in  Chapter  3,  both  of  these  software 
packages  run  on  the  DOS  platform  (Table  3-1), 
which  will  probably  be  most  convenient  to  readers. 
The  results  of  kriging  estimates  are  portrayed  by 
gray-scale  maps  rather  than  by  contours  because 
of  the  objective  nature  of  the  gray-scale  format. 
North  is  at  the  top  of  all  maps  presented  in  this 
ETL,  although  this  orientation  may  represent  some 
deviation  from  the  real  data. 


5-2.  Water-Level  Examples 

a.  The  following  examples  are  for  ground- 
water  levels.  The  principal  purpose  of  the  exam¬ 
ples  is  to  expose  the  reader  to  a  kriging  exercise 
using  groundwater  levels  and  to  indicate  how,  in  a 
simple  manner,  kriging  standard  deviations  may  be 
useful  to  investigators  interested  in  evaluating 
monitoring  networks.  The  data  come  from  a 
water-table  setting  in  unconsolidated  sediments 
where  the  local  relief  for  the  land  surface  is  about 
30  m.  The  data  involved  in  tins  example  are  con¬ 
sidered  virtually  free  of  actual  measurement  error. 

b.  The  location  of  measured  water  levels  is 
shown  in  Figure  5-la  and  the  basic  univariate 
statistics  for  this  data  set  are  listed  in  Table  4-1; 
modifications  to  the  measured  data,  in  die  form  of 
addition  and  removal  of  measured  values,  are 
shown  in  Figures  5-lb  and  5-lc.  The  techniques 
described  in  Chapter  4  were  used  to  guide  the 
following  steps  for  variogram  construction: 

(1)  A  raw  variogram  analysis,  along  with 
basic  hydrologic  knowledge  of  water-level  behav¬ 
ior,  indicated  that  universal  kriging  would  be 
needed  for  fliis  analysis. 

(2)  To  obtain  a  stable  variogram  of  residuals, 
an  iterative,  generalized  least-squares  operation 
was  initially  used  to  remove  prominent  linear  drift 
of  the  form  a  +  bu  +  cv,  observed  in  the  measured 
water  levels. 

(3)  After  drift  was  removed,  residuals  were 
determined  to  be  stationary  and  universal  kriging 
with  a  linear  drift  was  appropriate. 
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Figure  5-1.  Measured  data  for  water-level  examples-A,  original  data;  B,  original  data  without  dropped  sites; 
C,  original  data  with  added  sites  (added  sites  indicated  with  *)  (Sheet  1  of  3) 


A  Gaussian  model  was  used  to  fit  the  stabilized  listed  in  Table  5-1.  Cross-validation  statistics 

variogram  of  residuals  (Figure  5-2a),  which  has  a  conform  to  the  criteria  discussed  in  Chapter  4. 

nugget  of  0.093  m^  a  sill  of  2.69  m^  and  a  range 

of  1 ,2 1 9  m  (Table  5- 1 ).  c.  Linear  drift  is  commonly  observed  in 

grormdwater  elevation  data  where  there  are  no 

(5)  Cross-validation  was  performed,  and  the  major  anthropogenic  activities,  such  as  large 
results  are  shown  in  Figures  5-2b  and  5-2c,  and  groundwater  withdrawals.  With  these 
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Figure  5-1.  (Sheet  2  of  3) 


circumstances  fliere  is  usually  a  fairly  uniform  and 
general  groundwater  movement  that  is  generally 
expressed  in  terms  of  direction.  This  uniform  and 
general  nature  introduces  a  nonstationary  element 
to  the  data  that,  in  geostatistics,  is  referred  to  as 
drift.  As  indicated  in  Chapter  4,  the  presence  of 
drift  is  indicated  by  a  parabolic  variogram  sh^e. 
In  this  example,  the  initial  variogram  in  the  raw 
variogram  analysis  had  a  characteristic  parabolic 


shape  and  a  linear  drift  was  identified.  Once  the 
drift  was  identified  and  characterized,  universal 
kriging  procedures  were  used. 

d.  A  Gaussian  model  is  usually  appropriate 
for  variograms  with  highly  continuous  variables 
such  as  groundwater-elevation  data,  and  it  is  par¬ 
ticularly  appropriate  in  fliis  example.  The  vario¬ 
gram  (Figure  5-2a)  at  small  lags  ^yond  the  nugget 
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Figure  5-1.  (Sheet  3  of  3) 


has  an  upward  concavity  that  cannot  be  fit  with  a 
linear,  spherical,  or  exponential  model.  The 
observed  shape  was  interpreted  as  a  function  of 
continuous  small-scale  variability.  The  Gaussian 
model  fits  the  bowl  shape  of  the  small  lag  data 
(and  other  data  to  a  lag  of  about  610  m)  well,  but 
it  is  not  flexible  enou^  to  closely  fit  the  points 
much  beyond  610  m,  indicating  that  kriging  esti¬ 


mates  should  be  computed  using  neighborhoods 
with  a  search  radius  less  than  610  m.  In  Chapter 
4,  the  initial  part  of  the  variogram  was  described 
as  having  the  most  effect  on  subsequent  kriging 
estimates. 

e.  The  established  variogram  then  was  used 
with  the  measured  data  to  produce  universal 
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Figure  5-2.  Variogram  and  variogram  cross-validation  plots  for  residuals  in  water-level  example~A,  theoretical 
variogram;  B,  cross-validation  scatterplot;  C,  cross-validation  probability  plot  (Sheet  1  of  3) 


kriging  estimates  for  all  points  in  a  26-by-26  grid 
with  a  grid  size  of  about  61-by-61  m.  A  gray¬ 
scale  map  of  the  kriging  water  levels  is  shown  in 
Figure  5-3a  and  basic  univariate  kriging  estimate 
statistics  are  listed  in  Table  5-2a  (water  level  A). 
The  kriging  results  a  are  a  good  representation  of 
the  results  from  other  more  elaborate  studies. 

/  Kriging  standard  deviations  for  the  kriging 
estimates  are  shown  in  Figure  5-3b.  The  magni¬ 
tude  of  kriging  standard  deviations  can  provide 
investigators  with  a  direct  indication  of  where  the 
imcertainty  associated  with  kriging  estimates  is 
relatively  high  or  low.  The  areas  of  greatest  uncer¬ 
tainty  for  the  kriged  water  levels  are  in  the  upper 
right  and  lower  left  comers  of  the  map,  where 
standard  deviations  are  as  high  as  about  1.4  and 
0.8.  Not  surprisingly,  these  areas  are  where  the 
density  of  the  measured  data  is  relatively  low. 
Throughout  much  of  the  remainder  (about  70  per¬ 
cent)  of  the  map,  the  kriging  standard  deviation  is 
almost  constant  at  about  0.35. 


g.  To  use  the  kriging  standard-deviation 
values  in  a  more  quantitative  matmer,  the  investi¬ 
gator  needs  to  establish  some  assurance  that  the 
measured  data  and  the  reduced  kriging  errors  are 
approximately  normally  distributed  and  also  that 
the  assumption  of  stationary  residuals  after  drift 
removal  is  correct.  If  the  investigator  is  confident 
about  these  assumptions,  then  the  basic  statistical 
principles  involving  confidence  intervals  can  be 
applied.  In  this  example,  the  standard  deviation  of 
about  0.35  throughout  most  of  the  map  indicates 
that  there  is  a  95-percent  chance  that  die  trae  value 
at  a  location  where  there  is  a  kriging  estimate  will 
be  within  about  0.70  (twice  the  kriging  standard 
deviation)  of  the  kriging  estimate. 

A.  As  an  example  of  evaluating  network 
density  and  the  accuracy  of  kriging  estimates,  two 
new  maps  were  developed.  To  make  the  first  map, 
a  decrease  in  network  density  was  effected  by 
removing  nine  measured  locations  from  the  north¬ 
west  part  of  the  area  (Figure  5- lb)  where  sampling 
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Figure  5-2.  (Sheet  2  of  3) 


density  was  high  and  kriging  standard  deviations 
were  low.  Kriging  estimates  were  produced  for  die 
same  grid  and  the  basic  univariate  kriging  estimate 
statistics  are  listed  in  Table  5-2  (water  level  B). 
The  map  shown  in  Figure  5-3c  indicates  that  the 
ratio  of  die  original  kriging  standard  deviations  and 
the  kriging  standard  deviations  with  the  nine  mea¬ 
sured  locations  removed  is  always  very  close  to 
1.00,  which  indicates  that  there  is  very  little  dif¬ 
ference  between  the  two  sets  of  kriging  standard 
deviations  and  that  water  levels  are  oversampled  in 
the  area  where  the  nine  measured  locations  were 
removed. 

i.  To  produce  the  second  map  (Figure  5-lc) 
nine  locations  were  added  in  the  southwest  comer 
where  die  sampling  density  was  relatively  low  and 
the  kriging  standard  deviation  was  relatively  high. 
In  section  2-4,  Equation  2-47  indicates  that  the 
universal  kriging  variance  depends  on  the  vario- 
gram,  the  type  of  trend,  and  measurement  loca¬ 
tions;  in  this  respect  the  kriging  standard  deviation 
does  not  depend  on  the  values  at  measurement 


locations.  Consequently,  values  of  zero  were  used 
for  the  nine  new  measurement  locations  and  only 
the  resultant  map  of  kriging  standard  deviations 
(Figure  5-3d)  is  of  interest.  The  map  shows  that 
the  kriging  standard  deviations  in  the  lower  left 
comer,  which  formerly  had  values  of  about  0.8, 
have  been  decreased  by  a  factor  of  approximately 
0.25,  which  indicates  that  the  kriging  estimates, 
based  on  the  geometry  of  the  network,  are  more 
reliable. 


5-3.  Bedrock-Elevation  Examples 

a.  The  following  examples  are  for  bedrock 
elevations.  The  principal  purposes  of  the  examples 
are  to  familiarize  the  reader  with  a  kriging  exercise 
using  bedrock  elevations  and  to  describe  block 
kriging.  The  data  come  from  an  area  where  bed¬ 
rock  consists  of  a  series  of  intercalated  terrestrial 
deposits  that  have  been  weathered  somewhat  and 
then  covered  with  alluvium.  The  opportunity  for 
measurement  error  in  these  types  of  data  is 
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Figure  5-2.  (Sheet  3  of  3) 

inevitable  because  the  determination  of  just  where 
bedrock  begins  is  complicated  and  subjective. 

b.  The  set  of  measured  locations,  set  A,  is 
shown  in  Figure  5-4a  and  the  basic  univariate  sta¬ 
tistics  are  listed  in  Table  4-1  (bedrock  A);  modifi¬ 
cations  to  the  measured  data,  in  the  form  of 
removal  of  sites  is  shown  in  Figure  5-4b.  The 
techniques  described  in  section  4-1  were  used  to 
guide  die  following  steps  for  variogram 
construction: 

(1)  The  raw  variogram  indicated  a  stationary 
spatial  mean.  The  data  were  assumed  to  be  suit¬ 
able  for  ordinary  kriging. 

(2)  An  isotropic  Gaussian  model  was  used  to 
fit  the  variogram  which  had  a  nugget  of  0.650  m^, 
a  sill  of  12.54  m^,  and  a  range  of  914  m 

(Table  5-1,  bedrock  A). 


(3)  Cross-validation  was  performed,  and  the 
results,  (Table  5-1,  bedrock  A),  were  not 
acceptable. 

c.  The  cross-validation  exercise  produced  a 
reduced-root-mean-squared  error  of  2. 146 
[Table  5-1  (bedrock  A)]  which  indicates,  as 
described  in  Chapter  4,  that  the  kriging  variance  is 
underestimated  to  an  unsatisfactory  degree.  Fur¬ 
ther  attempts  to  fit  the  Gaussian  model  to  the 
sample  variogram  points  produced  better  cross- 
validation  statistics;  however,  the  Gaussian  curve 
began  to  depart  substantially  from  the  sample 
variogram  points  at  the  lower  lag  sample  points. 
As  a  result,  the  distribution  of  the  residuals  was 
explored,  and  the  eastern,  and  especially  norfli- 
eastem,  parts  were  determined  to  contain  prob¬ 
lematic  data  values  that  rendered  the  distribution 
nonhomogeneous.  The  nonhomogeneous  nature  is 
related  to  an  incised  channel  present  on  the 
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Figure  5-3.  Kriging  results  for  water-level  examples-A,  kriging  estimates  for  original  data;  B,  kriging  standard 
deviations  for  original  data;  C,  ratio  (original  data  to  original  with  dropped  sites)  of  kringing  standard 
deviations;  D,  kriging  standard  deviations  for  original  data  with  added  sites  (Sheet  1  of  4) 


bedrock  surface.  At  this  juncture,  die  measured 
data  were  restricted  to  exclude  the  outlying  mea¬ 
surements.  Before  this  decision  was  made,  two 
alternative  methods  for  dealing  with  the  outlying 
values  were  considered  and  deemed  beyond  the 
scope  of  fliis  effort  However,  a  brief  discussion  of 
the  situation  is  appropriate. 


d.  The  first  alternative  considered  was  to  fit  a 
contrived  and  nongradual  surface  to  the  measured 
data  and  remove  the  outlier  effect  A  splined  sur¬ 
face  could  be  capable  of  producing  flie  desired 
result  The  decision  whether  or  not  to  pursue  such 
a  remedy  becomes  somewhat  philosophical.  In  a 
relatively  simple  example,  as  in  this  bedrock 
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Figure  5-3.  (Sheet  2  of  4) 

example,  such  a  remedy  may  be  entirely  appro¬ 
priate;  however,  some  investigators  may  support 
the  idea  that  the  situation  is  actually  dealing  with 
two  unique  and  homogeneous  domains.  Therefore, 
the  second  alternative  considered,  distributing  the 
kriging  process  so  that  each  homogeneous  domain 
is  addressed  independently,  becomes  more  attrac¬ 
tive.  In  more  complicated  applications  where  a 
large  number  of  domains  are  present,  a  distributed 


approach  may  be  necessary  to  avoid  an  undue 
amount  of  compromise. 

e.  The  restriction  of  measured  data,  set  B,  is 
shown  in  Figure  5-4b  and  flie  basic  univariate  sta¬ 
tistics  are  listed  in  Table  4-1  (bedrock  B).  The 
restriction  exercise  resulted  in  removing  17  meas¬ 
ured  locations  and  in  the  truncation  of  the  north¬ 
eastern  part  of  the  area  so  that  the  area  became 
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polygonal  rather  than  rectangular.  Again,  the  tech¬ 
niques  described  in  Chapter  4  were  used  to  guide 
the  following  steps  for  variogram  construction: 

(1)  A  Gaussian  model  was  used  to  fit  the  vari¬ 
ogram  which  had  a  nugget  of  0.650  m^,  a  sill  of 
8.36  m^,  and  a  range  of  732  m.  The  variogram 
indicated  a  stationary  spatial  mean. 


(2)  Initial  cross-validation  was  performed,  and 
the  nugget  was  changed  from  0.650  m^  to  0.743  m^ 
to  improve  cross-validation  statistics.  The  final 
variogram  is  shown  in  Figure  5-5a  and  character¬ 
istics  are  listed  in  Table  5-1. 


5-11 


ETL  1110-1-175 
30  Jun  97 


1  '  __ - 

D  ^ 

: 

i.w 

UI 

Imm 

IBflBLflflflflKBflMMH 

J  NOflIN 

i 

■ 

■ 

■ 

1 

li 

Bl 

II 

II 

II 

II 

II 

II 

fli 

i 

B 

B 

-Jt 

B 

B 

fl 

fl 

fl 

fl' 

fl 

fl! 

fli 

fl 

fli 

fl^ 

fli 

j 

B^ 

fli 

B 

S 

B 

"1 

fl 

fl 

fl 

fl; 

fl 

fli 

fli 

fl 

fli 

fl 

fli 

fl 

i; 

m 

fl 

fl 

fli 

fli 

fl 

fl 

fl 

fli 

1 

■ 

fli 

■ 

H 

B 

H 

fl 

1 

fl 

fli 

fl^ 

fl 

fl 

fl 

fli 

fl 

ill 

1 

s 

B 

Is 

[■ 

B 

fl 

fl 

fl 

fl 

fl 

fli 

fl 

fl 

fl 

fl 

II 

: 

■ 

fli 

fl 

ii 

if 

s 

■ 

□ 

fl 

fl 

fl 

fl 

fl 

fl: 

fl 

fl 

fl 

II 

fl 

1 

1 

■ 

H 

■; 

til 

■ 

B 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

II 

1 

fl 

1 

B 

1 

B 

IkI 

B 

B 

fl 

H 

fl 

fl 

B' 

fl 

B 

fl 

1 

fl 

1 

1 

B 

fl 

1 

B^ 

B 

fl 

B 

fl 

fl 

fl 

fl 

nn 

fl 

■ 

fl 

HI 

■ 

fl 

1 

B 

fl 

■ 

S' 

^  in 

a 

B 

B 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

1' 

1 

fl 

fl 

fl 

B 

B 

m 

■ 

B 

B 

— 

■ 

B 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

1 

1 

fl 

■ 

fl 

1 

B 

H 

■ 

B 

B 

B 

fl 

B 

fl 

fl 

fl 

fl 

1 

1 

1 

fl 

fl 

1 

fl 

B 

1 

H 

fl 

■ 

fl 

fl 

fl 

H 

fl 

fl 

fl 

■ 

B 

1 

fl 

fl 

I 

fl 

B 

B 

■ 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

H 

ll 

fl 

I 

fl 

fl 

I 

B 

fl 

fl 

m 

tg 

5 

^  570 

3T0 

m 

m 

70 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

1 

fl 

1 

1 

■ 

B 

1 

H 

■ 

a 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

1 

1 

1 

fl 

1 

I 

B 

B 

fl: 

m 

m 

fl 

B 

fl 

fl 

1 

B 

m 

1 

1 

B 

fl 

1 

ifl 

IB 

fl 

1 

B 

B 

fl 

B 

fl 

■ 

fl 

fl 

fl 

fl 

B 

fl 

fl 

fl 

fl 

B 

fl 

fl 

fl 

B 

H 

0 

B 

B 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

B 

fl 

1 

1 

1 

Ifl 

fl 

fl 

1 

1 

■ 

mM 

Wm 

m 

B 

B 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

fl 

1 

1 

fl 

11 

fl 

fl 

II 

fl 

1 

wm 

Wm 

s 

fl 

fl 

■ 

fl 

fl 

fl 

B 

■ 

fl 

fl 

1 

1 

■ 

II 

I 

Ifl 

II 

:l 

fl 

fl 

M 

fl 

■ 

■ 

fl 

fl 

fl 

fl 

fl 

■ 

Ifl 

1 

1 

■ 

ifl 

11 

II 

II 

1 

1 

;H 

■ 

fl 

fl 

■ 

fl 

fl 

fl 

fl 

fl 

m 

H 

1 

fl 

IB 

fl 

fl 

Ifl 

Ifl 

Ifl 

■ 

ill 

fl 

fl 

!■ 

fl 

fl 

ifl 

fl 

Ifl 

fl 

IB 

fl 

1 

11 

IB 

II 

II 

II 

Ifl 

IB 

I 

■ 

iB 

fl 

H 

B 

H 

ifl 

fl 

fl 

■ 

fl 

1 

n 

IB 

II 

II 

11 

II 

Ifl 

fl 

■ 

nm 

mm 

1  sBraflwv  flPflnMHi 

i  flnflVMBI  B4MMWW 

0.32 

240  390  S«0  090  840  990  1,140  U90  M40  %590 

MAP  DISTANCE.  IN  METERS 

PtPLAMAHOW 

1  1  1  1  f 

0.53  0.7S  OJO  i.l8  1-40 

iNOi*  10  iwmo  vaiujES  IN.  «Eff«S 

Figure  5-3.  (Sheet  4  of  4) 

(3)  Final  cross-validation  was  performed,  and 
the  results,  shown  on  Figures  5-5b  and  5-5c  and 
listed  in  Table  5-1  (bedrock  B),  were  acceptable. 

f.  The  large  dilference  between  die  sill 
defined  for  the  initial  data  set  and  the  sill  for  the 
restricted  data  set  (12.54  and  8.36  m^)  supports 
the  hypothesis  that  the  original  data  set  is  actually 
two  different  domains.  The  final  variogram  then 


was  used,  along  with  the  measured  data,  to  pro¬ 
duce  ordinary  kriging  estimates  for  all  points  in  a 
52-by-52  grid  with  a  spacing  of  about  30-by-30  m, 
which  is  truncated  along  the  northeastern  border 
because  of  the  restriction  operation.  For  the  krig¬ 
ing  procedure,  a  search  radius  of  about  914  m 
witha  maximum  of  16  and  minimum  of  8  sur¬ 
rounding  locations  was  specified.  Gray-scale 
maps  of  the  kriging  estimates  and  kriging  standard 
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Table  5-2 

Univariate  Statistics  for  Gridded  Kriging  Estimates  in  Example  Applications^ 


Example 

identifier 

Transformation 

Minimum 

(Base 

units) 

Maximum 

(Base 

units) 

Mean 

(Base 

units) 

Median 

base 

units) 

Standard 
deviation 
(base  units) 

Skewness 

(dimensioniess) 

Water  level  A 

Drift 

24.34 

65.00 

45.86 

44.46 

10.15 

0.11 

Water  level  B 

Drift 

24.59 

65.00 

45.84 

44.45 

10.14 

0.11 

Bedrock  B 

None 

26.13 

64.88 

41.45 

39.78 

7.71 

0.82 

Bedrock  C 

None 

26.72 

64.39 

41.50 

39.69 

7.63 

0.82 

Water 
quality  A 

Natural  log 

2.92 

7.07 

5.17 

5.03 

0.72 

-0.06 

’Base  units  for  water  level  A  and  B  and  bedrock  B  and  C  is  feet;  base  unit  for  water  quality  A  is  iog  concentration,  concentration  in 
micrograms  per  liter. 


deviations  are  shown  in  Figures  5-6a  and  5-6b, 
respectively,  and  the  univariate  kriging  estimate 
statistics  are  listed  in  Table  5-2  (bedrock  B).  The 
kriging  results  indicate  channel-like  features  in  the 
bedrock  surface,  as  well  as  a  prominent  bedrock 
high  at  the  south  border  of  the  area;  the  results  are 
a  good  representation  of  the  results  from  other 
more  elaborate  studies. 

g.  For  an  example  of  block  kriging,  an  invest¬ 
igative  goal  of  establishing  block  values  of  bedrock 
elevation  for  a  finite-difference  groundwater  model 
grid  having  about  120-  by  120-m  cells  was 
assumed.  The  same  variogram  and  search  criteria 
were  used  to  estimate  block  values  for  a  13-by- 
13  grid  with  about  120-  by  120-m  spacing;  a 
4-by-4  block  was  specified.  Each  kriging  value 
shown  in  Figure  5-6c  is  interpreted  as  an  estimate 
of  die  average  value  of  bedrock  elevation  over  the 
about  120-  by  120-m  block.  The  standard  devi¬ 
ation  for  the  block  estimates  is  less  tiian  the  stan¬ 
dard  deviation  for  the  point  estimates  (Table  5-2). 
Gray-scale  maps  of  the  kriging  estimates  and  the 
kriging  standard  deviations  are  shown  in  Fig¬ 
ures  5-6c  and  5-6d,  and  the  univariate  kriging 
estimate  statistics  are  listed  in  Table  5-2 
(bedrock  C). 


5-4.  Water-Quality  Examples 


determined  for  a  contaminant.  The  principal  pur¬ 
poses  of  the  examples  are  to  famiharize  the  reader 
with  a  kriging  exercise  using  water-quality  infor¬ 
mation  and  to  illustrate  indicator  kriging.  The 
examples  also  will  familiarize  the  reader  with  data 
that  are  strongly  anisotropic  and  need  transforma¬ 
tion.  The  data  come  from  a  water-table  aquifer 
developed  in  alluvial  sediments  where  the  depth  to 
water  was  less  than  about  23  m.  Several  analytical 
laboratories  were  involved  in  measuring  the  con¬ 
centration  of  the  contaminant  in  the  water-quality 
examples.  Each  of  the  analytical  laboratories  was 
required  to  follow  rather  comprehensive  guidelines 
that  specified  tests  of  instrument  performance 
before  sample  determinations  were  made,  as  well 
as  measurement  of  extraction  efficiencies. 

Because  of  these  performance  guidelines,  the 
opportunity  for  errors  due  to  instrument  error  was 
considered  to  be  either  known  or  relatively  low.  In 
addition  to  using  performance  guidelines,  field 
quality-assurance  samples  were  also  collected. 
These  samples  can  be  used  to  evaluate  other  types 
of  possible  errors,  such  as  cross-contamination 
and  representativeness  of  the  sample.  Duplicate 
samples  for  the  contaminant  in  die  water-quality 
examples  indicate  as  much  as  15  percent  varia¬ 
bility  m  reported  results.  This  variability  is  not 
entirely  unusual  and  is  most  likely  related  to  the 
integrity  of  the  analytical  method  or  the  method  in 
which  the  sample  media  was  aggregated  during 
sample  collection. 


a.  The  following  examples  are  for  water- 
quality  information  consisting  of  concentrations 
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Figure  5-4.  Measured  data  for  bedrock-elevation  examples-A,  original  data  and  B,  restricted  data  (Continued) 
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Figure  5-5.  Variogram  and  variogram  cross-validation  plots  for  bedrock-elevation  example~A,  theoretical  variogram;  B,  cross-validation  scatterplot; 
C,  cross-validation  probability  plot  (Sheet  1  of  3) 


ETL  1110-1-175 
30  Jun  97 


5-17 


Figure  5-S.  (Sheet  2  of  3) 
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Figure  5-5.  (Sheet  3  of  3) 
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Figure  5-6.  (Sheet  3  of  4) 
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b.  Measured  locations  are  shown  in  Fig¬ 
ure  5-7  and  the  basic  univariate  statistics  are  listed 
in  Table  4-1  (water  quality  A).  An  initial  review 
of  the  data  indicated  three  important  features. 

(1)  The  data  seemed  to  have  strong  anisotropy 
at  about  150  counterclockwise  degrees  to  the  east- 
west  baseline. 

(2)  The  data  required  a  natural  log  transfor¬ 
mation  so  the  distribution  was  approximated  by  a 
normal  distribution. 

(3)  No  trends  were  indicated  during  prelimi¬ 
nary  exploration,  and  ordinary  kriging  was  tenta¬ 
tively  selected  as  the  appropriate  technique. 

c.  Natural  log  transformations  are  routinely 
needed  for  concentration  data  that  vary  over  sev¬ 
eral  orders  of  magnitude,  which  is  common  in 
areas  of  contaminant  plumes.  The  data  were 
transformed  to  log  space  and  fit  acceptable  criteria 
for  normality.  After  transformation  to  log  space, 
the  techniques  described  in  Chapter  4  were  used 
to  guide  tile  following  steps  for  variogram 
constraction: 

(1)  An  exponential  model  was  used  to  fit  a 
directional  variogram  at  an  angle  of  150  counter¬ 
clockwise  degrees  to  the  east-west  baseline.  The 
variogram  had  a  nugget  of  1.00  log  concentration 
squared,  a  sill  of  3.20  log  concentration  squared, 
and  a  range  of  1,295  m  [Figure  5-8a  and  Table  5-1 
(water  quality  A)]. 

(2)  An  exponential  model  was  also  fit  to  a 
directional  variogram  at  an  angle  of 240  counter¬ 
clockwise  degrees  to  the  east-west  baseline.  The 
variogram  had  a  nugget  of  1.00  log  concentration 
squared,  a  sill  of  3.20  log  concentration  squared, 
and  a  range  of 229  m  [Figure  5-8b  and  Table  5-1 
(water  quality  A)]. 

(3)  Cross-validation  was  performed  using  the 
geometric  anisotropy  of  tiie  two  variograms  and 
the  results  [Figures  5-8c  and  5-8d,  and  Table  5-1 
(water  quality  A)]  were  acceptable. 


d.  The  residuals  are  symmetrically  distribu¬ 
ted,  (Figure  5-8d).  However,  the  scatterplot  (Fig¬ 
ure  5-8c)  indicates  that  small  concentrations  are 
overestimated  and  that  large  concentrations  are 
underestimated.  This  discrepancy  in  the  estimates 
does  not  indicate  an  error  in  the  model,  but  rather, 
indicates  a  consequence  of  data  that  have  a  large 
nugget  compared  to  the  sill;  in  this  example  the 
nugget  is  approximately  30  percent  of  the  sill.  The 
large  nugget  decreases  the  predictive  capacity  of 
the  model  and  increases  tiie  smootiiing  introduced 
by  kriging. 

e.  The  established  variogram  then  was  used, 
along  with  the  measured  locations,  to  produce 
ordinary  kriging  estimates  for  all  points  in  a  40-by- 
20  grid  using  a  grid  spacing  of  about  91-by-91  m. 
For  the  kriging  procedure,  a  search  radius  of  about 
1 ,524  m  with  maximum  of  1 6  and  a  minimum  of 

8  locations  was  specified.  Gray-  scale  maps  of 
kriging  estimates,  back  transformed  to  concentra¬ 
tions  and  in  log  space,  as  well  as  the  kriging  stan¬ 
dard  deviations  in  log  space,  are  shown  in  Fig¬ 
ures  5-9a,  5-9b,  and  5-9c. 

/  The  back-transformation  procedure  was  a 
simple  exponentiation  of  the  log  space  kriging 
estimates.  Such  a  back-transformation  does  not 
use  bias-correction  factors  to  deal  with  moment 
bias  and,  consequently,  the  back-transformed 
values  must  be  interpreted  as  a  median  value 
rather  tiian  a  mean  value.  The  simple  back- 
transformation,  however,  is  convenient  and  was 
performed,  principally,  to  enhance  visual  inter¬ 
pretation  of  the  kriging  estimates.  Univariate  sta¬ 
tistics  for  the  log-space  kriging  estimates  are  listed 
in  Table  5-2  (water  quality  A).  The  kriging  results 
do  have  noticeable  smoothing;  however,  they  also 
indicate  a  plume  emanating  from  an  area  just 
northwest  of  the  center  of  the  area  and  movement, 
as  well  as  some  dispersion,  to  the  southeast;  the 
estimates  are  a  very  good  representation  of  the 
results  from  many  other  more  elaborate  studies. 

g.  An  additional  comment  concerning  log 
transformations  is  appropriate.  To  indicate  the 
effect  of  the  log  transform  on  probabilities  in 
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Figure  5-7.  Measured  data  for  water-quality  examples 
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Figure  5-8.  Directional  variograms  and  variogram  cross-validation  piots  for  water-quality  example-A,  theoretical  major-direction  variogram  (southeast); 
B,  theoretical  minor-direction  variogram  (northeast);  C,  cross-validation  scatterpiot;  D,  cross-validation  probability  plot  (Sheet  1  of  4) 
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Figure  5-8.  (Sheet  2  of  4) 
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Figure  5-8.  (Sheet  3  of  4) 
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Figure  5-9.  (Sheet  3  of  4) 
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Figure  5-9.  (Sheet  4  of  4) 


converting,  or  back-transforming,  kriging  esti¬ 
mates,  the  kriging  estimates  and  the  kriging  stan¬ 
dard  deviations,  in  log  space,  were  used  to  estimate 
the  one-sided  95th  percentile  at  each  kriging- 
estimate  location  according  to  the  formula: 


^0.95 


(5-1) 


exp 


(V 


+  1.6450 


A 

where  (Z  Xq)  ( is  the  kriging  estimate  at  location  Xq, 
in  log  space,  and  Ok(xo)  is  the  corresponding  krig¬ 
ing  standard  deviation  in  log  space.  The  resulting 
map  is  shown  in  Figure  5-9d.  Such  a  map  can  be 
used  to  indicate  areas  where  the  true  concentration 
has  only  a  5-percent  chance  of  exceeding  the  value 
shown. 


h.  To  perform  indicator  kriging,  the  indicator 
transformation,  as  described  in  Chapter  2,  was 
applied.  An  indicator  cutoff  equal  to  the  median 
value  of  270  for  the  untransformed  measured  data 
was  selected.  The  model  for  indicator  kriging  esti¬ 
mates  the  probability  that  the  concentration  would 
be  less  than  the  indicator  cutoff.  The  techniques 
described  in  Chapter  4  were  used  to  guide  the  fol¬ 
lowing  steps  in  variogram  construction: 

(1)  No  trends  were  indicated  during  prelimi¬ 
nary  exploration,  and  ordinary  kriging  was  tenta¬ 
tively  selected  as  the  appropriate  technique. 

(2)  A  spherical  model  was  used  to  fit  an 
anisotropic  variogram  at  an  angle  of  150  deg 
counterclockwise  to  the  east-west  baseline.  The 
variogram  had  a  nugget  of  0.05  indicator  units 
squared,  a  sill  of  0.25  indicator  units  squared,  and 
a  range  of  610  m  [Figure  5-lOa  and  Table  5-1 
(water  quality  B)]. 
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(3)  A  spherical  model  also  was  fit  to  an 
anisotropic  variogram  at  an  angle  of 240  deg 
counterclockwise  to  the  east-west  baseline.  The 
variogram  had  a  nugget  of  0.05  indicator  units 
squared,  a  sill  of  0.25  indicator  units  squared,  and 
a  range  of  213  m  [Figure  5-lOb  and  Table  5-1 
(water  quality  B]. 

i.  The  established  variogram,  along  with  the 
indicator  transform  of  the  measured  data,  was  used 
to  produce  ordinary  kriging  estimates  for  the  same 
grid  and  search  criteria  as  the  first  water-quality 
example.  A  gray-scale  map  of  the  kriging  esti¬ 
mates  is  shown  in  Figure  5-11.  The  kriging  indi¬ 
cator  map  provides  a  gridded  estimate  for  the 
probability  of  contaminant  values  being  less  than 
the  indicator  cutoff,  which  is  a  concentration  of 
270  in  this  example. 

j.  The  cutoff  value  selected  for  the  preceding 
indicator  kriging  example  is  probably  hi^er  than 
many  investigators  involved  in  FITRW  site  investi¬ 
gations  would  like  to  use.  In  this  case  the  number 
of  measurements  [66  in  Table  4-1  (water  qual¬ 
ity  B)]  used  in  this  example,  which  is  probably  a 
high  number  of  measurements  for  typical  HTRW 
site  investigations,  would  not  permit  constmction 
of  an  indicator  variogram  for  indicator  values 
much  lower  than  the  median.  An  alternative  to  this 
problem  would  be  to  assume  that  the  log- 
transformed  kriging  model  developed  in  the  first 
water-quality  example  is  correct  and  to  rely  on  the 
kriging  estimates  fi’om  that  model  to  determine 
areas  greater  than  or  less  than  some  indicator 
value.  The  same  estimates  also  could  be  used  to 
compute  the  probability  that  the  concentration  was 
less  than  some  arbitrarily  selected  value. 
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direction  variogram  (Continued) 
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Figure  5-10.  (Concluded) 
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Figure  5-11.  indicator  kriging  results  for  water-quality  example 
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Chapter  6 

Review  of  Kriging  Applications 


This  chapter  will  briefly  treat  three  principal  top¬ 
ics;  applicability  of  kriging  techniques,  important 
elements  that  need  to  be  addressed  in  kriging  appli¬ 
cations,  and  errors  in  measured  data.  Much  of  the 
information  presented  in  this  section  has  been 
gathered  from  other  sections  of  this  ETL  and  is 
presented  collectively  here.  The  items  identified  as 
important  to  kriging  applications  may  be  helpful  in 
assessing  kriging  applications  under  review. 

6-1 .  Applicability  of  Kriging 

a.  In  the  preceding  sections  of  this  ETL,  die 
theory  of  kriging  techniques  has  been  summarized, 
and  examples  have  been  given  to  indicate  the  util¬ 
ity  of  kriging  techniques  in  HTRW  site  investiga¬ 
tions.  The  examples  presented  were  selected  so 
that  kriging  would  provide  satisfactory  results  or 
be  applicable.  Additionally,  the  examples  were 
designed  so  that,  for  the  purposes  of  demonstra¬ 
tion,  some  sort  of  adjustment  of  the  data  was 
needed;  that  is,  drift  was  removed  or  transforma¬ 
tions  were  made. 

b.  Investigators  are  very  likely  to  have  data 
for  which,  although,  in  a  strict  sense,  kriging  may 
be  applicable,  results  may  be  unsatisfactory.  A 
good  deal  of  fundamental  information  that  may  be 
used  to  establish  how  satisfactory  application  of 
kriging  techniques  might  be  has  been  presented  in 
the  preceding  sections  of  this  ETL.  In  particular. 
Chapter  4  includes  a  detailed  discussion  on  vario- 
gram  construction,  the  preliminary  step  in  any 
kriging  application,  and  systematically  describes 
many  decisions  in  this  process  that  need  attention. 
If  the  investigator  cannot  construct  or  otherwise 
obtain  a  variogram  that  has  structure,  then  the 
results  of  a  kriging  application  may  not  be  satis¬ 
factory.  Some  additional  discussion  designed  to 
guide  investigators  in  evaluating  the  amount  of 
data  that  may  be  required  for  kriging  applications 
is  presented  in  this  section.  This  discussion  will 
assume  that  the  measured  data  are  correct;  a 


separate  and  brief  discussion  of  measurement 
errors  will  also  be  presented  in  this  section. 

c.  Many  investigators  will  have  a  tendency  to 
focus  on  the  amount  of  measured  data  that  is  avail¬ 
able  as  an  initial  consideration.  It  is  important  for 
the  investigator  to  realize  that  decisions  concerning 
the  applicability  of  kriging  techniques  cannot  be 
based  simply  on  the  amount  of  measured  data. 
However,  unless  the  investigator  is  presented  with 
a  reliable  variogram,  the  amount  and  spatial  distri¬ 
bution  of  measured  data  can  be  a  constraint.  If, 
for  instance,  the  investigator  has  fewer  than  25 
measured  values  at  optimal  locations  from  the 
field,  there  may  not  be  enough  data  to  confidently 
estimate  Gaussian  variogram  parameters,  a  smaller 
amount  of  measured  data  may  be  suitable  for  other 
variogram  models. 

d.  The  amount  of  data  needed  to  apply  krig¬ 
ing  techniques  is  not  easy  to  determine,  but  infor¬ 
mation  in  this  ETL,  especially  in  Chapter  4,  and 
the  literature  cited  can  provide  some  guidance. 
Section  4-4  points  out  tiiat  a  good  minimum  for  the 
number  of  pairs  of  locations  in  each  variogram  lag 
is  30  and  the  American  Society  for  Testing  and 
Materials  (Standard  D  5922-96)  has  suggested  that 
20  may  work  well  also.  Most  investigators  would 
probably  feel  comfortable  defining  a  Gaussian 
form  (which,  because  it  has  more  inflection,  is 
more  difficult  to  fit  compared  to  the  other  standard 
variogram  models)  with  8  to  10  optimally  located 
sample  variogram  points  (enough  points  to  define 
the  nugget,  two  areas  of  curvature,  and  the  sill).  In 
this  ideal  case,  about  25  measured  values  would  be 
needed  to  fulfill  the  conservative  minimum  of 

30  pairs  per  lag.  In  this  case,  the  relatively  few 
measured  data  points  need  to  be  systematically 
located  so  that  the  optimally  located  variogram 
points  can  be  computed.  If  the  measured  data  were 
not  located  systematically,  as  is  usually  the  case, 
then  more  measured  data  would  be  needed. 

e.  Once  sample  variogram  points  meeting  the 
required  nmnber  of  pairs  of  locations  can  be 
defined,  the  investigator  needs  to  have  a  resulting 
variogram  that  has  structure.  The  variogram,  for 
instance,  may  simply  exhibit  noise  about  a 
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horizontal  line  and  have  no  structure.  If  measured 
data  are  clustered  and  the  lags  have  been  mini¬ 
mized  to  meet  the  required  number  of  pairs  of  loca¬ 
tions,  the  variogram  may  seem  horizontal  because 
it  is  dominated  by  small-scale  effects  in  the 
clustered  data.  The  investigator  then  has  latitude 
to  adjust  the  lags  and  attempt  to  balance  the  lag 
spacing  and  required  number  of  pairs  of  locations, 
as  described  in  section  4-4.  However,  the  vario¬ 
gram  could  also  seem  horizontal  because  the  actual 
sill  is  reached  within  a  very  small  lag.  If  that  lag  is 
smaller  than  the  minimum  spacing  of  measured 
data,  obtaining  structure  in  &e  variogram  would 
not  be  possible.  If  the  investigator  has  a  vario¬ 
gram  with  no  structure,  the  measured  data  need  to 
be  considered  independent,  and  kriging  techniques, 
at  the  lag  of  the  measured  data,  would  be  ineffec¬ 
tive  or  at  least  offer  little  advantage  over  other 
interpolation  techniques. 


6-2.  Important  Elements  of  Kriging 
Appiications 

a.  Many  important  elements  of  kriging  appli¬ 
cations  have  been  discussed  in  this  ETL.  These 
discussions  have  been  presented  as  a  systematic 
and  sequential  method  designed  to  provide  guid¬ 
ance  in  kriging  applications.  Occasionally,  an 
investigator  will  be  presented  with  the  results  of  a 
previous  kriging  application  and  will  need  to  eval¬ 
uate  the  application  before  deciding  whether  or  not 
to  use  the  results.  This  section  presents  a  brief 
review  of  some  important  elements  of  kriging 
applications  that  such  an  investigator  may  use  in 
that  evaluation.  For  a  more  detailed  discussion  of 
important  elements  of  geostatistical  applications, 
the  reader  is  referred  to  the  American  Society  of 
Testing  and  Materials  (Standard  D  5549-94)  for 
content  of  geostatistical  investigations. 

b.  The  presence  of  or  lack  of  stationarity  in 
the  spatial  mean  needs  to  be  demonstrated  defini¬ 
tively.  If  the  spatial  mean  is  nonstationary,  then 
drift  is  indicated  and  appropriate  measures  to 
establish  stationarity,  which  are  similar  to  the 
measures  presented  in  section  4-3,  need  to  be  part 
of  the  j^plication.  In  ideal  situations. 


nonstationarity  occurs  as  a  gradual  change. 

HTRW  site  investigations  may  present  cases, 
especially  when  dealing  with  water-  quality  data  in 
and  aroimd  plumes,  that  have  abrupt  step-like 
changes  at  plume  boundaries  and  do  not  appear  as 
regional  drift.  In  these  cases  the  investigator  needs 
to  be  aware  that  without  knowledge  of  the  plume 
boundaries,  points  from  within  the  plume  will  be 
grouped  with  points  from  outside  the  plume  in 
computing  the  sample  variogram.  The  effect  of  this 
problem  is  minimized  as  long  as  the  investigator 
can  define  lags  that  allow  data  points  within  the 
plume  to  be  grouped  together. 

c.  The  construction  of  the  variogram  needs  to 
be  described.  The  description  needs  to  address  the 
number  of  pairs  of  locations  in  each  variogram  lag 
and  to  demonstrate  that  the  variogram  has  struc¬ 
ture.  A  plot  of  the  variogram  is  helpful  to  demon¬ 
strate  the  presence  or  absence  of  structure.  The 
variogram  construction  discussion  also  needs  to 
establish  the  presence  of  or  lack  of  isotropy.  If 
anisotropy  is  present,  its  nature  needs  to  be  estab¬ 
lished,  and  it  needs  to  be  addressed  by  variogram 
adjustments  similar  to  the  adjustments  presented  in 
section  4-5&. 

d.  The  variogram  cross-validation  statistics 
described  in  section  4-9  are  useful  and,  if  avail¬ 
able,  (hey  can  aid  in  the  evaluation  of  a  kriging 
application;  authoritative  and  definitive  kriging 
applications  should  include  cross-validation. 
Cross-validation  statistics  need  to  conform  to  the 
guidelines  discussed  in  section  4-9.  Section  4-9h 
indicates  that  the  cross-validation  exercise  needs  to 
balance  minimizing  the  kriging  cross-validation 
errors  with  efforts  to  guard  against  bias.  Also,  as 
discussed  in  section  4-9h,  if  probabilistic  state¬ 
ments  are  part  of  the  kriging  application,  there 
needs  to  be  some  demonstration  about  the  normal¬ 
ity  of  the  reduced  kriging  error  such  as  the  cross- 
validation  probability  plots  included  with  the 
examples  in  Chapter  5. 

e.  Maps  of  the  kriging  estimates  and  standard 
deviations  need  to  be  presented  or  discussed.  The 
maps  of  kriging  estimates  need  to  conform  to  any 
qualitative  information  about  the  information 
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portrayed  on  the  maps  fliat  is  available  to  the 
investigator.  The  maps  of  kriged  standard  devia¬ 
tions  can  be  used  to  determine  where  there  are 
large  areas  of  uncertainty  in  the  kriging  estimates. 

f.  Finally,  the  variogram  and  kriging  algo¬ 
rithms  are  most  useful  as  interpolation  rather  than 
extrapolation  tools.  Once  the  application  extends 
to  areas  beyond  the  geographic  extremes  of  the 
measured  data,  or  perhaps  those  extremes  plus  the 
range,  there  needs  to  be  some  qualification  of  the 
area  of  extrapolation.  For  instance,  in  universal 
kriging,  the  practitioner  would  need  to  have  some 
assurance  that  the  conditions  of  drift  delRned  in  the 
study  area  continue  into  the  area  of  extrapolation. 

6-3.  Errors  in  Measured  Data 

a.  Data  associated  with  HTRW  site  investi¬ 
gations  have  the  same  opportunities  for  errors  that 
most  investigations  do.  The  errors  may  involve, 
among  others,  bias,  inaccuracy,  or  lack  of  repre¬ 
sentativeness.  The  classical  nature  of  these  errors 
is  described  in  EM  200-1-2,  “Technical  Project 
Planning,”  (U.S.  Army  Corps  of  Engineers  1995), 
which  describes  HTRW  data-quality  design. 

b.  The  presence  of  contamination  may  com¬ 
plicate  the  function  of  errors  in  HTRW  site 
investigations.  Because  these  investigations  often 


concern  contamination,  there  can  be  large  ranges 
of  values  for  data  involving  contaminant  concen¬ 
trations,  and  these  large  ranges  have  a  tendency  to 
increase  the  incidence  of  data  that  may  seem  to  be 
statistical  outliers.  Even  more  complicating  is  the 
presence  of  high  concentrations  of  organic  mate¬ 
rials  that  may  create  challenging  analj^ical  prob¬ 
lems  in  laboratory  determinations  that  also  may 
lead  to  reported  values  that  seem  to  be  statistical 
outliers.  In  either  case,  the  kriging  practitioner  is 
likely  to  find  ftiat  the  apparent  outliers  have  a 
strong  effect  on  the  results  of  the  kriging 
application. 

c.  When  HTRW  site  investigations  find  data 
that  seem  to  be  outliers,  the  data  need  to  be  very 
carefully  evaluated  before  removal  is  seriously 
contemplated.  Automated  outlier  detection  tools, 
as  suggested  in  section  4-8,  may  best  be  used  to 
identify  points  that  may  be  outliers  and  warrant 
further  investigation.  Often  data  that  appear  to  be 
outliers  may  be  the  most  important  and  meaningful 
data  of  all  measurements.  For  example,  in  the  first 
case  described  in  the  preceding  paragraph,  appar¬ 
ent  outliers  often  are  representative  values.  In  the 
second  case,  the  reported  value  may  be  an  errone¬ 
ous  determination  that  has  been  affected  by  the 
extremely  contaminated  nature  of  the  sample 
matrix.  The  investigator  needs  to  either  possess  or 
have  access  to  qualitative  or  institutional  knowl¬ 
edge  that  will  aid  in  outlier  interpretation. 
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Chapter  7 

Other  Spatial  Prediction  Techniques 


7-1.  General 

a.  In  this  chapter,  some  alternative 
approaches  to  spatial  prediction  are  discussed.  At 
the  beginning  of  Chapter  2,  the  distinction  between 
stochastic  and  nonstochastic  techniques  for  spatial 
prediction  was  discussed.  Kriging,  the  main  sub¬ 
ject  of  this  ETL,  is  a  stochastic  technique  because 
of  the  structure  that  is  imposed  in  terms  of  an 
underlying  random  process  (the  regionalized 
variables)  with  joint  probability  distributions  that 
obey  certain  assumptions.  Kriging  yields  the 
predictor  that  is  statistically  optimal  in  the  sense 
that  it  is  die  best  linear  unbiased  predictor,  given 
certain  assumptions  that  are  detailed  in  Chapter  2. 
There  are  other  stochastic  techniques  that  are  less 
well-known  than  kriging  in  applications,  such  as 
Markov-random-field  prediction  and  Bayesian 
nonparametric  smoothing  (see  Cressie  (1991)),  but 
these  will  not  be  discussed  here. 

b.  Several  techniques  that  are  often  applied 
in  a  nonstochastic  setting  will  be  discussed.  Tech¬ 
niques  applied  in  such  a  setting  are  typically 
applied  strictly  empirically  and  not  evaluated  with 
respect  to  rigorous  statistical  criteria  such  as  mean 
squared  prediction  error,  although,  as  discussed  in 
Chapter  2,  such  criteria  may  be  applied  in  certain 
of  the  techniques  such  as  simple  average  and  trend 
analysis.  It  has  been  shown  in  this  ETL  that  there 
are  some  compeUing  advantages  for  assuming 
some  kind  of  stochastic  setting.  However,  the  sim¬ 
plicity  of  not  having  to  postulate  and  justify  the 
structure  and  assiunptions  inherent  in  stochastic 
analyses  might  be  considered  one  advantage  of 
nonstochastic  techniques,  and  such  an  analysis 
may  be  perfectly  adequate  for  certain  problems.  In 
addition  to  statistical  optimality  and  simplicity, 
there  are  other  considerations  in  selecting  a  spatial 
prediction  technique,  such  as  ease  of  computation, 
sensitivity  to  data  errors,  and  whether  the  predic¬ 
tors  are  exact  interpolators;  that  is,  match  the  mea¬ 
surements  exactly  at  the  measurement  locations  x,, 
X2,—.  in-  The  last  property  is  one  that  needs  to  be 


given  careful  consideration  by  the  practitioner. 
Kriging,  as  it  is  usually  applied,  is  an  exact  inter¬ 
polator.  Questions  may  be  raised,  however,  about 
whether  this  is  a  desirable  property  if  it  is  known 
that  the  measurements  are  contaminated  wifli  a 
considerable  amount  of  measurement  error.  One 
advantage  of  stochastic  methods  in  general  is  that 
existence  of  measurement  error  may  be  incorpo¬ 
rated  objectively,  and,  in  fact,  some  krigiag  soft¬ 
ware  packages  (including  STATPAC)  have  this 
feature,  resulting  in  a  surface  that  is  not  an  exact 
interpolator.  Several  of  the  nonstochastic  methods 
discussed  in  this  section  depend  on  a  parameter  that 
controls  the  deviation  from  exact  interpolation. 

The  ability  to  adjust  such  a  parameter  when  using 
these  techniques  lends  a  degree  of  flexibility  to  the 
practitioner,  but  selecting  the  best  value  may  not  be 
straightforward  and  may  involve  considerable 
subjectivity  on  the  part  of  the  practitioner. 

c.  In  most  of  flie  following  techniques,  the 
predictor  of  the  process  at  location  Xo  takes  the 
form  of  a  linear  combination  of  thejneasurements 
at  locations  /=1,  2,...,  n.  Using  Z  (xo)  to  denote 
an  arbitrary  predictor  (the  notation  distinguishes 
the  predictors  to  be  discussed  in  this  section  from 
the  kriging  predictor,  which  is  denoted  by  Z  (Xq), 
the  definition  of  Z  (^)  is 

%)  =  E  (2.)  (7-1) 

i  =  1 

Alfliough  this  form  is  the  same  form  that  is  taken 
by  the  kriging  predictor,  the  difference  is  in  the  way 
the  coefficients  w,  are  computed. 

7-2.  Global  Measure  of  Central 
Tendency  (Simple  Averaging) 

a.  The  predictor  for  the  process  at  any 
location  Xo  is  file  simple  average  of  the  measure¬ 
ments;  that  is,  file  weights  w,  are  all  equal  and  are 
given  by  Cressie  (1991) 
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This  predictor  represents  the  smoothest  possible 
predictor  surface.  In  using  this  predictor,  a  certain 
degree  of  spatial  homogeneity  is  assumed.  No 
attempt  is  made  to  incorporate  any  detectable 
patterns  (or  trends)  in  the  mean  or  variance  of  the 
data  as  a  function  of  location,  and  the  fact  that 
measurements  made  at  points  that  are  close  to  each 
other  may  be  related  is  disregarded.  Such  a  pre¬ 
dictor  has  the  advantage  of  being  very  simple  to 
compute;  it  needs  no  estimation  of  a  variogram  or 
other  model  parameters.  The  disadvantage  is  that 
representing  the  spatial  field  by  a  single  value 
ignores  much  of  the  relevant  and  interesting  strac- 
ture  that  may  be  very  helpful  in  improving 
predictions. 

b.  As  discussed  in  section  2-4,  if  applied  in  a 
stochastic  setting,  diis  predictor  would  be  optimal 
(best  linear  unbiased)  if  fliere  is  no  drift  and  if 
residuals  are  uncorrelated  and  have  a  common 
variance. 


^  =  1,  the  predictor  is  an  exact  interpolator  and  is 
constant  on  the  Voronoi  polygons  (see  section  7-5) 
induced  by  the  measurement  locations. 

c.  There  are  several  variations  of  this  pre¬ 
dictor.  In  one  such  variation,  a  distance  r  may  be 
fixed  (rather  than  fixing  k)  and  averages  over  loca¬ 
tions  that  are  within  distance  r  ofxo  taken.  Addi¬ 
tionally,  a  moving-median  may  be  used  rather  than 
a  moving  average.  Sorting  and  testing  distances 
can  slow  computations  relative  to  obtaining  the 
simple  average,  and  use  of  medians  rather  than 
means  leads  to  a  more  resistant  (to  outliers) 
predictor. 

7-4.  Inverse-Distance  Squared  Weighted 
Average 

a.  The  weights  w,  are  (Joumel  and  Huijbregts 
1978) 


7-3.  Simple  Moving  Average 

a.  Let  hjo  be  the  distance  of  Xq  fi'om  Xj,  let  h[,o] 
be  the  ordered  (fi-om  smallest  to  largest)  distances, 
and  fix  1  ^  A:  s  «.  Then  the  weights  w,  are 
(Cressie  1991) 


hi 


w,  = 


j-i  h;, 


70 


(7-4) 


where  again  is  the  distance  ofxo  fi'om  x,. 


w,  = 


^iO  ^  ^[tO] 
0,  A,o  > 


(7-3) 


Thus,  tiiis  predictor  is  the  average  of  the  measure¬ 
ments  at  die  k  nearest  locations  from  xq. 

b.  If  k  is  equal  to  n,  this  predictor  is  identical 
to  the  simple  average,  with  weights  as  given  in 
Equation  7-2.  A  choice  of  k  smaller  than  n  reflects 
an  assumption  that  the  predictor  needs  to  incor¬ 
porate  more  of  the  local  fluctuation  observed  in  the 
data,  or,  equivalently,  that  measurements  at  loca¬ 
tions  near  Xo  should  be  more  informative  than 
measurements  at  other  locations  in  predicting  z(xo); 
the  smaller  k  is,  the  more  variable  the  predictor.  If 


b.  In  the  simple  moving  average,  weights  are 
the  same,  provided  measurement  locations  are 
sufficiently  close  to  the  prediction  location  and  are 
zero  otherwise.  For  the  inverse-distance  squared 
method,  weights  are  forced  to  decrease  in  a 
smoother  maimer  as  distance  from  the  prediction 
location  increases.  This  predictor  again  has  the 
advantage  of  being  easy  to  compute.  Another 
feature  of  this  predictor  is  that  it  is  an  exact  inter¬ 
polator.  In  addition,  the  exponent  2  of  hi^  may  be 
changed  to  any  positive  number,  giving  the  user 
some  flexibility  in  determining  the  rate  of  decrease 
of  weights  as  a  function  of  distance  from  Xo.  Isaaks 
and  Srivastava  (1989,  pp.  257-259)  present  an 
example  illustrating  the  effects  on  weights  of 
changing  the  exponent. 
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7-5.  Triangulation 

a.  To  compute  this  predictor,  the  region  R  is 
partitioned  into  what  are  referred  to  as  Voronoi 
polygons  K],  V2,...,  V„,  with  Vi  being  the  set  of 
locations  closer  to  measurement  location  than  to 
any  other  measurement  location.  If  any  two  poly¬ 
gons,  Vi  and  Vj,  share  a  common  boundary,  jc,  and 
^  are  joined  with  a  straight  line.  The  collection  of 
dl  such  lines  defines  what  is  known  as  the 
Delaunay  triangulation.  There  will  be  one  such 
triangle  containing  the  prediction  location  the 
vertices  of  this  triangle,  which  are  measurement 
locations,  are  labelled  x^,  and  x,.  The  spatial 
prediction  at  x^  will  be  the  planar  interpolant 
through  the  coordinates  (Xj,  z(Xj)),  zfxi)),  and 
(xi_  z(X|)).  Joining  Xq  and  Xj,  Xi„  and  j,,  three  sub¬ 
triangles  are  formed.  The  weights  w,-  are  (Cressie 
1991) 


A. 


Aj  +  Ak+  Aj 


,  i  =  j,  k,  or  I 


(7-5) 


0, 


otherwise 


where  Ai  is  the  area  of  the  subtriangle  opposite 
vertex  x-. 


b.  These  definitions  are  illustrated  in  Fig¬ 
ure  7-1.  In  this  figure,  the  dashed  lines  depict  the 
Voronoi  polygons  associated  with  points  Xy,  X.2, 

X6,  and  the  solid  lines  define  the  Delaunay  triangu¬ 
lation.  Vertices  of  the  triangle  containing  the  pre¬ 
diction  point  io  are  ii,  Xs,  and  X6>  and  dotted  lines 
show  the  subtriangles  defining  the  associated  area 
Ai,Aj,  A^.  Forlhisexample,7,  A,and/inthe 
general  Equation  7-5  are  1,  5,  and  6,  so  the 
weights  assigned  to  points  Xi,  is,  and  Xe  are, 
respectively. 


It  is  seen  that  the  weight  assigned  to  a  point  is  pro¬ 
portional  to  the  area  of  the  triangle  opposite  the 
point. 


c.  Computation  of  this  predictor  is  slower 
than  computation  of  those  in  sections  7-2, 7-3,  and 
7-4.  The  predictor  is  an  exact  interpolator,  and  the 
surface  produced  is  continuous,  but  not  differen¬ 
tiable  at  the  edges  of  the  triangulation. 


7-6.  Splines 

a.  In  spline  modeling,  the  measurements  are 
interpolated  using  combinations  of  certain  so-called 
basis  fimctions.  These  basis  fimctions  are  usually 
taken  to  be  piecewise  polynomials  of  a  certain 
degree,  say  k,  which  is  determined  by  the  user.  The 
coefficients  of  these  polynomials  are  chosen  so  that 
the  fimction  values  and  the  first  i-1  derivatives 
agree  at  the  locations  where  they  join.  The  larger  k 
is,  the  smoother  will  be  the  prediction  surface. 
Spline  techniques  are  often  applied  in  a  non¬ 
stochastic  framework;  in  such  a  context  they 
represent  a  way  of  fitting  a  surface  with  certain 
smoothness  properties  to  measurements  at  a  set  of 
locations  with  no  explicit  consideration  of  statisti¬ 
cal  optimality.  There  is,  however,  a  considerable 
body  of  work  in  which  this  technique  is  appUed  in  a 
stochastic  setting.  Splines  may  be  used,  for 
example,  in  nonparametric  regression  estimation 
problems  (Wegman  and  Wright  1983). 

b.  A  typical  approach  to  formulating  a  spline 
problem  is  to  pose  it  as  an  optimization  problem. 

In  one  special  case,  it  is  assumed  that  the  first  two 
derivatives  of  the  prediction  surface  exist,  which  is 
a  way  of  imposing  a  certain  degree  of  smoothness, 
and  that  the  spline  function  minimizes 


w, 


Ai  +  A^  +  A^ 
"^1  ^5  ^6 


,  and 


~  T,  (i,.)  -  2  +  Tig  (7-7) 


(7-6) 


7-3 


where  0  is  a  term  that  depends  on  the  first  two 
derivatives  of  the  predictor  surface.  The  parame¬ 
ter  T]  is  a  nonnegative  number  that  needs  to  be 
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Figure  7-1.  Diagram  showing  Voronoi  poiygons 


specified  by  the  user;  the  value  of  dus  parameter 
reflects  the  trade-off  between  goodness  of  fit  to  the 
data,  measured  by  the  first  term,  and  smooflmess, 
as  measured  by  Q.  If  rj  is  chosen  to  be  0,  the 
spline  is  an  exact  interpolator  and  passes  through 
all  the  data  points.  If  t)  >  0,  the  spline  is  not  an 
exact  interpolator.  (Splines  that  are  not  exact 
interpolators  are  referred  to  as  smoothing  splines.) 
There  are  a  number  of  numerical  procedures  that 
may  be  used  for  fitting  splines,  but  allowing  the 


smoothing  parameter  ti  to  be  >  0  renders  the 
computational  problem  more  complex. 

c.  Under  some  conditions  a  solution  to  the 
optimization  problem  (Equation  7-7)  may  also  be 
obtained  by  a  kriging  algorithm  if  the  smoothing 
parameter  T|  is  taken  to  be  equal  to  the  variance  of 
measurement  error  and  if  a  special  form  is  chosen 
for  the  covariance  function.  Therefore,  in  this 
situation,  spline  approximation  is  a  special  type  of 
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kriging.  However,  the  variogram  that  needs  to  be 
used  in  the  kriging  equations  to  make  the  kriging 
predictor  equivalent  to  the  spline  predictor  is 
determined  by  flie  basis  functions  selected  for  the 
spline.  Because  the  type  of  basis  functions  used  is 
subjective  on  the  part  of  the  user,  the  resulting 
equivalent  variogram  may  not  be  representative  of 
the  true  variogram  of  the  data.  Because  kriging 
uses  the  data  to  indicate  reasonable  variogram 
choices,  kriging  has  an  important  advantage  over 
splines.  Another  advantage  of  placing  the  problem 
in  the  kriging  fiamework  is  the  interpretation  of  the 
smoothing  parameter  in  terms  of  measurement 
errors.  In  many  cases,  an  objective  estimate  of  the 
magnitude  of  measurement  error  can  be  obtained. 
The  connections  between  kriging  and  splines  are 
discussed  further  by  Wegman  and  Wright  (1983), 
Watson  (1984),  and  Cressie  (1991). 

7-7.  Trend-Surface  Analysis 

a.  Trend-surface  analysis  is  the  process  of 
fitting  a  function,  such  as  diat  in  Equation  2-43, 
using  least  squares  to  determine  the  coefficients 
that  yield  the  best  fit.  Computationally,  trend- 
surface  analysis  is  equivalent  to  universal  kriging 
with  an  assumption  fliat  the  Z*(^  are  imcorre- 
lated.  Thus,  there  is  no  need  to  estimate  a  vario¬ 
gram,  and  readily  available  regression  packages 
may  be  used  for  estimating  the  coefficients.  As  in 
universal  kriging,  polynomial  surfaces  are  the  most 
commonly  used. 

b.  When  trend  surfaces  are  applied  in  a  sto¬ 
chastic  setting,  the  resulting  predictor  will  be  opti¬ 
mal  if  deviations  from  the  surface  are  uncorrelated 
and  have  a  common  variance. 


7-8.  Simulation 

a.  Consider  again  a  regionalized  random 
variable  Z(^,  where  x  is  a  location  in  a  two- 
dimensional  study  region  R.  Kriging  is  an  inter¬ 
polation  algorithm  that  yields  spatial  predictions  Z 
that  are  best,  or  optimal,  in  the  sense  that  has 
been  disctissed  at  some  lengdi  in  this  ETL.  The 


mean-squared  prediction  error  is  smallest  among 
all  predictors  that  are  linear  in  the  measurements. 
This  optimality  property  is  local,  in  that  the  mean- 
squared  error  of  predictions  at  unsampled  locations 
considered  one  at  a  time  is  minimized,  without 
specific  regard  to  preservation  of  global  spatial 
features.  If,  however,  the  actual  realization  z(^ 
could  be  compared  to  the  kriged  prediction  surface 
based  on  n  measured  values,  the  kriged  surface 
would  be  much  smoother  than  the  actual  surface, 
especially  in  regions  of  sparser  sampling.  Thus, 
the  kriged  siuface  will  be  a  good  and  realistic 
representation  of  reality  in  the  sense  that  the  n 
measured  values  are  honored,  but  it  will  be  less 
realistic  with  respect  to  global  properties,  such  as 
overall  variability. 

b.  The  purpose  of  simulation  is  to  produce 
one  or  more  spatial  surfaces  (realizations)  fliat  are 
more  realistic  in  preserving  global  properties  flian 
the  surface  produced  by  interpolation  algorithms, 
such  as  kriging.  These  realizations  are  produced 
by  using  numbers  that  are  drawn  randomly  (Monte 
Carlo)  to  impart  variability  to  the  simulated  sur¬ 
face,  making  the  simulated  surface  more  realistic  in 
preserving  the  overall  appearance  of  the  actual 
surface.  Generally  spe^ng,  simulation  uses  the 
idea  that  the  true  value  of  a  random  surface  may  be 
expressed  as  the  sum  of  a  predicted  value  (which  is 
obtained  by  kriging)  plus  a  random  error,  which 
varies  spatially  and  depends  on  the  random 
numbers  drawn.  Generally  a  number  of  indepen¬ 
dent  realizations  will  be  generated,  and  these 
realizations  will  be  taken  to  be  equally  probable 
representations  of  reality. 

c.  A  simulation  algorithm  is  said  to  be  condi¬ 
tional  if  the  resulting  realizations  agree  with  the 
measurements  at  measurement  locations  Xj,  ..., 
x„.  If  the  underlying  process  Z(x)  is  assumed  to  be 
Gaussian  (or  if  a  transformation  may  be  found  that 
makes  the  process  Gaussian),  the  most  common 
method  of  conditional  simulation  is  known  as 
sequential  Gaussian  simulation  (Deutsch  and 
Joumel  (1992),  pp.  141-143).  Another,  more  com¬ 
plicated,  Gaussian  simulation  method  that  is  par¬ 
ticularly  useftil  for  three-dimensional  simulations 
because  of  its  computational  efficiency  is  the 
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turning-bands  method  (Deutsch  and  Joumel  1992, 
Joumel  and  Huijbregts  1978). 

d.  In  sequential  Gaussian  simulation  a  set  of 
grid  points  for  which  simulated  values  are  desired 
is  defined  and  the  points  are  addressed  sequentially 
from  location  to  location  along  a  predetermined 
path.  At  each  location,  a  specified  set  of  neighbor¬ 
ing  conditioning  data  is  retained,  including  the 
original  data  and  simulated  grid-location  values  at 
previously  traversed  grid  locations  along  the  path. 
Then,  a  random  number  is  generated  from  a 
Gaussian  distribution  with  conditional  mean  and 
variance  determined  using  a  kriging  algoriflim,  and 
the  value  of  the  random  number  determines  the 
simulated  process  at  this  location.  The  conditional 
Gaussian  ^stribution  used  in  simulation  is  identi¬ 
cal  to  the  conditional  distribution  discussed  in 
section  2-6fe.  An  idea  of  the  computational 
requirements  can  be  obtained  from  the  fact  that  a 
kriging  algorithm  needs  to  be  applied  for  each 
simulation  location.  For  multiple  realizations,  if 
the  path  connecting  tiie  grid  points  is  kept  the 
same,  the  kriging  equations  need  to  be  solved  for 
only  the  first  simulation.  However,  implementa¬ 
tion  of  this  procedure  needs  to  take  into  considera¬ 
tion  the  assumptions  concerning  the  existence  of 
drift;  the  details  of  such  an  implementation  are 
beyond  the  scope  of  this  ETL. 

e.  A  sequential  algoridim  like  this  may  also 
be  applied  in  the  context  of  indicator  kriging  (see 
section  2-6c).  At  each  grid  point  along  fee  path,  a 
(Bemouli)  random  variable  taking  on  only  two 
possible  values,  0  or  1,  is  generated,  wife  fee  rela¬ 
tive  probability  of  these  two  values  being  deter¬ 
mined  by  indicator  kriging  applied,  as  in  fee 
previous  paragraph,  to  fee  original  observed  indi¬ 
cator  data  and  fee  previously  simulated  indicator 
values. 

f.  To  get  an  idea  of  how  simulation  results 
mi^t  be  used  in  a  risk-assessment  setting,  assume 
again  feat  fee  underlying  process  is  Gaussian  and 
feat  1,000  conditional  realizations  have  been 
generated.  If  a  single  grid  point  Xq  (which  is  not  a 
measurement  point)  is  considered,  then  fee  simu- 
ation  has  produced  1,000  values  at  ip.  which. 


when  analyzed  in  histogram  form,  approximates 
fee  probability  distribution  of  potential  measure¬ 
ments  at  feat  location.  If  an  interval  wife  exactly 
25  (2.5  percent)  of  fee  values  less  than  fee  lower 
end  and  25  of  fee  values  larger  than  fee  upper  end 
were  constructed,  fee  interval  would  almost  corre¬ 
spond,  as  e}q)ected,  to  fee  95-percent  prediction 
interval  to  Z(xo)- 1.960;^  (xp)  to  Z  (2^)  +  1.96ajf 
(xp)  discussed  in  section  2-6i.  Thus,  for  this  single 
location,  fee  simulation  has  not  produced  much 
more  information  than  kriging  alone  would  have 
produced.  The  real  value  of  simulation,  however, 
is  feat  realizations  not  just  at  a  single  location,  but 
at  all  of  fee  grid  locations  jointly,  are  obtained. 
These  realizations  can  be  used  to  calculate  proba¬ 
bilities  associated  wife  any  number  of  spatial  loca¬ 
tions  together.  For  example,  fee  probability  feat 
fee  largest  (maximum)  contaminant  value  over  a 
certain  subregion  is  greater  than  a  particular  con¬ 
centration  might  be  assessed.  (If  fee  word  “larg¬ 
est”  here  were  replaced  wife  “average,”  then  block 
kriging  could  be  used  to  obtain  fee  answer.) 

g.  A  central  point  that  needs  to  be  empha¬ 
sized  is  feat  simulation  is  especially  useful  when 
probabilities  associated  wife  complicated,  usually 
nonlinear,  functions  of  fee  regionalized  variables 
over  a  region  need  to  be  analyzed.  The  maximum 
function  mentioned  in  fee  preceding  paragraph  is 
one  simple  example.  For  another  example,  con¬ 
sider  fee  problem  of  determining  placement  of 
groimdwater  monitoring  wells  to  detect  and  moni¬ 
tor  groundwater  contamination  emanating  from  a 
potential  point  source.  Given  an  existing  set  of 
hydraulic-head  data,  kriging  might  be  applied  and 
flow  lines  determined  from  resulting  hydraulic- 
head  gradients.  Intersection  of  fee  flow  line  from 
fee  point  source  wife  fee  regional  boundary  feen 
might  be  used  to  determine  monitoring  well  place¬ 
ment.  Conditional  simulation  would  be  useful  to 
determine  uncertainty  associated  wife  location  of 
well  placement  or  to  give  an  indication  of  how 
many  monitoring  wells  might  be  appropriate.  In 
this  case,  fee  variable  of  interest,  well  location,  is  a 
complicated  function  of  hydraulic  heads  so  this  is  a 
problem  for  which  simulation  is  well-suited.  The 
reader  may  refer  to  Easley,  Borgman,  and  Weber 
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(1991)  for  a  more  detailed  discussion  of  this  type 
of  application. 

h.  The  complicated  functions  of  interest  in 
groundwater  studies  often  involve  physically  based 
groundwater  flow  models.  Conditional  simulation 
may  be  used,  for  example,  to  generate  a  suite  of 


hydraulic-conductivity  realizations  to  be  used  as 
input  to  a  model  that  produces  as  output  a  set  of 
corresponding  hydravJic-head  realizations.  Weber, 
Easley,  and  Englund  (1991)  discuss  how  ground- 
water  modeling  might  be  used  with  conditional 
simulation  to  study  the  monitoring-well-placement 
problem  discussed  in  the  preceding  paragraph. 
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a  Angle  for  directional  variogram 

b  Slope  of  variogram 

c  Generic  constant  used  for  cutoff  value 
in  probability  distribution  or  indicator 
transformation 

e  Kriging  error 

e  Reduced  kriging  error 

/  Explanatory  variables  used  in  drift 
equations 

g  Nugget  of  variogram 
h  Lag  or  distance  between  two  data  points 
n  Number  of  data  points 
m  Number  of  locations  in  a  given  block 
r  Range  of  variogram 

s  Sill  of  variogram 

w  Weight 

x(UyV)  Location  in  terms  of  coordinates  u  and  v 
z(^  Measurement  of  Z  at  location  x 
z(xi)  Kriging  estimate  using  measured  data 
A  Area  of  triangle 

B  Area  designation  in  block  kriging 

C  Population  covariance  function 

A 

C  Sample  covariance  fimction 


Covariance  of  data  values  at  locations 
and  X2 

Dg  Difference  in  values  between  data 
points  i  and  j 

E  Expectation 

I(.)  Indicator  function 

K  Number  of  variogram  bins 

N(. )  Number  of  squared  differences  in 
variogram  bin 

P  Probability 

Sample  variance  on  n  observations 

V  Voronoi  polygon 

Var  Population  variance 

Co-kriging  random  variable  at 
location  x 

Y(^  Transformed  variable  at  location  x 

Z  Regionalized  random  variable 

Z(^  Potential  value  of  Z  at  location  x 

Z(x)  Predictor  or  estimate  of  Z  at  location 

X,  obtained  from  kriging 

Z*(x)  Residuals  of  Z(^ 

A 

Z(^)  Arbitrary  predictor  of  Z  at  location  x 

Sample  mean  of  n  observations 

p  Regression  coefficient  used  in  polyno¬ 

mial  representation  for  drift 
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Y  Sample  variogram 

Y  Theoretical  variogram 

y(h)  Theoretical  variogram  for  lag  h 
X  Optimization  coefficient 
11  Parameter  used  in  spline  analysis 
p  (h)  Correlation  function  as  function  of  h 


a(x)  Spatial  standard  deviation  at  location 

X 

a^(x)  Spatial  variance  at  location  x 

Kriging  standard  deviation  at 
location  x 

a\{x)  Kriging  variance  at  location  x 
|i  Spatial  mean  at  location  x 
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